Inxight ThingFinder

Leading Entity Extraction in Over 30 Languages

Attensity Government Systems is proud to offer Inxight ThingFinder, which provides advanced text analysis technology that automatically identifies and extracts key entities or other "things" from any text data source, in multiple languages, with no setup or manual creation of rules required.

Out of the box, Inxight ThingFinder automatically identifies and extracts more than 35 types of key entities - such as people, dates, places and companies - from any text data source, in multiple languages. This ability to automatically identify and classify relevant entities makes ThingFinder one of the most powerful text analysis and extraction tools on the market. Using Inxight ThingFinder, developers can extend the value of their applications by enabling end users to quickly find the most important pieces of information within large volumes of documents.

And by combining Inxight ThingFinder with Attensity Triples, you get the best of both worlds - "nouns and verbs" - for complete automated extraction of entities, relations, and events, along with the ability to use contextual information to disambiguate entities.

ThingFinder can be integrated into virtually any application that processes textual information. It enables users to create relevant, meaningful structured data from unstructured data, mine large volumes of text for relevant information, and quickly identify trends in data sets, such as trends and movements associated with people, places, dates and organizations.

Features

Extraction and Classification
ThingFinder leverages Inxight's core understanding of natural language processing - language-aware tokenization, part-of-speech tagging and noun phrase identification - to automatically extract and classify all entities.

Variant Identification and Grouping
Variant identification and grouping allow ThingFinder to accurately classify all relevant entities in a document, even one-word entities, and to provide true counts reflecting the number and location of all appearances of a given entity. For example, ThingFinder recognizes that a later appearance of the word "Smith" refers to the earlier identified person "Joe Smith."
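
The idea of grouping a one-word variant with an earlier full name can be illustrated with a short Python sketch. This is not ThingFinder's algorithm or API, which this page does not describe; the function name group_variants and the last-token matching rule are assumptions made purely for illustration.

```python
from collections import defaultdict


def group_variants(mentions):
    """Group person mentions under a canonical name (hypothetical helper).

    mentions: list of (text, offset) pairs in document order.
    A one-word mention is attached to the most recent earlier name whose
    last token matches it, ignoring case. Every other mention is filed
    under its own text.
    """
    canonical = []               # names seen so far, in document order
    groups = defaultdict(list)   # canonical name -> offsets of all mentions
    for text, offset in mentions:
        tokens = text.split()
        target = None
        if len(tokens) == 1:
            # Search backwards so the nearest earlier name wins.
            for name in reversed(canonical):
                if name.split()[-1].lower() == tokens[0].lower():
                    target = name
                    break
        if target is None:
            target = text
            if text not in canonical:
                canonical.append(text)
        groups[target].append(offset)
    return dict(groups)


mentions = [("Joe Smith", 0), ("Smith", 42), ("Mary Jones", 60), ("Smith", 90)]
grouped = group_variants(mentions)
```

With this input, both later "Smith" mentions are counted under "Joe Smith", so the count and the offsets cover every appearance of that person.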

Normalization
Normalization takes much of the guesswork out of metadata creation, search, data mining and link analysis processes by creating standard formats (e.g., ISO) for certain entity categories such as dates or measurements.
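
As a rough illustration of what normalizing dates to a standard format can look like, the Python sketch below converts a few written date styles to ISO 8601 (YYYY-MM-DD). It is not ThingFinder's implementation; the function name normalize_date and the list of accepted formats are assumptions, and the slash format is read as US-style month/day/year.

```python
from datetime import datetime

# Accepted input styles (an illustrative assumption, not an exhaustive list).
# "%B" matches English month names under the default C locale.
DATE_FORMATS = (
    "%B %d, %Y",   # March 5, 2010
    "%d %B %Y",    # 5 March 2010
    "%m/%d/%Y",    # 03/05/2010, assumed to be month/day/year
    "%Y-%m-%d",    # already ISO 8601
)


def normalize_date(text):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


iso_dates = [normalize_date(s) for s in ("March 5, 2010", "5 March 2010", "03/05/2010")]
```

All three inputs map to the same string, which is what lets search, data mining and link analysis treat them as one date.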

Relevance Ranking
The entities extracted by ThingFinder are given relevance scores reflecting their importance to the document as a whole, making ThingFinder an essential part of any data categorization solution.
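
This page does not say how ThingFinder computes its relevance scores. Purely as a toy illustration of the concept, the Python sketch below scores each entity by how often it appears and how early it first appears in the document. The function name rank_entities and the scoring formula are assumptions, not the product's method.

```python
def rank_entities(mentions, doc_length):
    """Rank entities with a toy frequency-and-position heuristic.

    mentions: dict mapping entity name -> list of character offsets.
    doc_length: total length of the document in characters.
    Returns (entity, score) pairs sorted from highest to lowest score.
    """
    if doc_length <= 0:
        raise ValueError("doc_length must be positive")
    scores = {}
    for entity, offsets in mentions.items():
        if not offsets:
            continue  # nothing to score for an entity with no mentions
        frequency = len(offsets)
        # 1.0 when the entity first appears at the start, approaching 0 at the end.
        earliness = 1.0 - min(offsets) / doc_length
        scores[entity] = frequency * (1.0 + earliness)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_entities({"Joe Smith": [0, 42, 90], "Mary Jones": [60]}, doc_length=100)
```

Here "Joe Smith" ranks first because it is mentioned three times and opens the document. A production system would likely weigh many more signals than these two.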

Languages Supported
Arabic, Bokmål, Catalan, Croatian, Czech, Danish, Dutch, English, Farsi, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Nynorsk, Polish, Portuguese, Romanian, Russian, Serbian, Simplified Chinese, Slovak, Slovenian, Spanish, Swedish, Thai, Traditional Chinese and Turkish.

Entity Types Supported
Pre-defined entity types vary by language module. For example, the English language module includes:
ADDRESS, CITY, CONTINENT, COUNTRY, CURRENCY, DATE, DAY, DISTRICT, FACILITY, FEDERATION, HOLIDAY, MEASURE, MONTH, NOUN_GROUP, ORGANIZATION, PEOPLE, PERCENT, PERSON, PHONE, PLACE_OTHER, PLACE_REGION, POSITION, PRODUCT, PROP_MISC, SPECIAL, SSN, STATE_PROVINCE, TICKER, TIME, TIME_PERIOD, URI, and YEAR.

Sub-entities and sub-types are supported for ADDRESS, CITY, DATE, FACILITY, ORGANIZATION, PLACE_OTHER, PLACE_REGION, URI, COMMON_FACILITY, COMMON_ORGANIZATION, COMMON_PERSON, COMMON_PLACE_OTHER, and COMMON_PLACE_REGION.