Pathways to SEASR Workshop 2009 - Day 1 - Textual analysis with UIMA and SEASR

One of the most interesting examples today was the textual analysis exercise using UIMA and SEASR. Mike Haberman presented some tools for analyzing and visualizing sentiment in a book, using thesauri and a controlled vocabulary of emotions. It isn’t clear where UIMA ends and SEASR begins. My first question was, what is SEASR doing that one couldn’t - or wouldn’t want to - do with existing tools? Since then, I’ve had some time to play with the Meandre Workbench, which is the software that drives SEASR tools. It seems obvious now: the issue isn’t what SEASR tools can do, it is how readily the activities of scholarship can be shared and reused in other contexts. Raw notes from the UIMA portion:
Examples:  SEASR and UIMA:

UIMA: Unstructured Information Management Architecture

Example: Use UIMA to analyze part-of-speech information.
•    Uses CAS (Common Analysis Structure) to serialize data and pass it from one analysis component to the next. Creates a chain of analytical components.
•    UIMA is accessed through Eclipse as a plugin:

o    XML description of the UIMA chain that will be run.
o    Choose document analyzer
o    Choose directory of files to analyze.

•    Result is the original data source (i.e. text) plus annotations (parts of speech, terms, names, etc.)
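
To make the chain idea concrete, here is a minimal sketch in Python. It does not use the UIMA API (UIMA itself is a Java framework); the `Document` and `Annotation` classes, the toy lexicon, and the function names are all made up for illustration. It only shows the general shape: one shared structure holds the original text, and each component in the chain adds annotations to it.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    begin: int        # character offset where the annotated span starts
    end: int          # character offset where the annotated span ends
    type: str         # e.g. "Token" or "PartOfSpeech"
    value: str = ""   # e.g. the part-of-speech tag


@dataclass
class Document:
    """Hypothetical stand-in for a CAS: original text plus annotations."""
    text: str
    annotations: list = field(default_factory=list)

    def covered_text(self, ann):
        return self.text[ann.begin:ann.end]


def tokenize(doc):
    # First component: mark every word-like span as a Token.
    for m in re.finditer(r"[A-Za-z']+", doc.text):
        doc.annotations.append(Annotation(m.start(), m.end(), "Token"))
    return doc


# Toy lexicon (an assumption); a real tagger would be far more involved.
TOY_LEXICON = {"the": "DET", "whale": "NOUN", "was": "VERB", "white": "ADJ"}


def tag_parts_of_speech(doc):
    # Second component: reads the Tokens left by the first one.
    tokens = [a for a in doc.annotations if a.type == "Token"]
    for tok in tokens:
        tag = TOY_LEXICON.get(doc.covered_text(tok).lower(), "UNKNOWN")
        doc.annotations.append(
            Annotation(tok.begin, tok.end, "PartOfSpeech", tag))
    return doc


def run_chain(text, components):
    # Pass the same document through each component in order.
    doc = Document(text)
    for component in components:
        doc = component(doc)
    return doc


result = run_chain("The whale was white", [tokenize, tag_parts_of_speech])
pos_tags = [(result.covered_text(a), a.value)
            for a in result.annotations if a.type == "PartOfSpeech"]
print(pos_tags)
```

The original text is never modified; every component only appends annotations that point back into it by offset, which is what lets later components build on earlier ones.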

SEASR application:
•    Pattern analysis – discover how characters in a novel are mentioned in relation to commonly occurring nouns.
•    Problem: handling sparse data sets – many possible nouns, very few in a given sentence.
•    Result: create a matrix of characters and nouns, with confidence ranking.
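
A rough sketch of what such a matrix could look like, in Python. The notes do not say how the confidence ranking was computed, so this assumes the simplest definition: of the sentences that mention a character, the fraction that also mention the noun. The function name and the sample sentences are invented for illustration. Only pairs that actually co-occur are stored, which is one way to cope with the sparseness.

```python
from collections import Counter, defaultdict


def cooccurrence_confidence(sentences, characters, nouns):
    """Return {character: {noun: confidence}} for observed pairs only."""
    char_counts = Counter()            # sentences mentioning each character
    pair_counts = defaultdict(Counter)  # sentences mentioning both
    for sentence in sentences:
        # Naive whitespace tokenization; assumes no punctuation.
        words = set(sentence.lower().split())
        present_nouns = [n for n in nouns if n.lower() in words]
        for c in characters:
            if c.lower() in words:
                char_counts[c] += 1
                for n in present_nouns:
                    pair_counts[c][n] += 1
    # Assumed confidence: co-occurrences / sentences with the character.
    return {c: {n: pair_counts[c][n] / char_counts[c]
                for n in pair_counts[c]}
            for c in char_counts}


sentences = [
    "Ahab watched the sea",
    "Ahab cursed the whale",
    "Ishmael watched the whale",
    "Ahab hunted the whale",
]
matrix = cooccurrence_confidence(
    sentences, ["Ahab", "Ishmael"], ["sea", "whale", "ship"])
for character, row in matrix.items():
    ranked = sorted(row.items(), key=lambda item: item[1], reverse=True)
    print(character, ranked)
```

Note that "ship" never appears in the output at all: a noun that never co-occurs with a character simply has no cell.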

Example: Use UIMA to analyze sentiment information.
•    Look at adjectives within body of text.
•    Use a thesaurus to chart path from an adjective to one of a predetermined set of emotional terms.
•    Rank the closeness (number of steps) of adjectives to a given emotional term.
•    Problems: different thesauri result in very different analyses.
•    SEASR visualization based on open source ActionScript library: Flare.
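
The "number of steps" idea can be sketched as a breadth-first search over a thesaurus treated as a graph. This is my own reconstruction in Python, not the code shown at the workshop; the two toy thesauri, the emotion list, and the `max_steps` cutoff are all assumptions. It also illustrates the problem noted above: swapping the thesaurus changes the answer.

```python
from collections import deque

# Two made-up thesauri mapping a word to its listed synonyms.
THESAURUS_A = {
    "cheerful": ["merry", "glad"],
    "merry": ["joyful"],
    "glad": ["happy"],
    "gloomy": ["dark", "sad"],
}
THESAURUS_B = {
    "cheerful": ["happy", "bright"],
    "gloomy": ["dark"],
    "dark": ["grim"],
}
EMOTIONS = {"happy", "sad", "angry"}  # the predetermined emotional terms


def steps_to_emotions(adjective, thesaurus, emotions, max_steps=4):
    """Breadth-first search: fewest synonym hops to each reachable emotion."""
    distances = {}
    seen = {adjective}
    queue = deque([(adjective, 0)])
    while queue:
        word, steps = queue.popleft()
        if word in emotions:
            distances[word] = steps
        if steps == max_steps:
            continue  # do not expand beyond the cutoff
        for synonym in thesaurus.get(word, []):
            if synonym not in seen:
                seen.add(synonym)
                queue.append((synonym, steps + 1))
    return distances


for name, thesaurus in [("A", THESAURUS_A), ("B", THESAURUS_B)]:
    for adjective in ["cheerful", "gloomy"]:
        print(name, adjective,
              steps_to_emotions(adjective, thesaurus, EMOTIONS))
```

With thesaurus A, "cheerful" is two steps from "happy" and "gloomy" is one step from "sad"; with thesaurus B, "cheerful" is one step from "happy" and "gloomy" reaches no emotional term at all.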
