About Athena Search

The Athena Search system is designed to let MSI users search the large (6+ GB) database of CAC call history that has accumulated over the last decade.  For a number of years the CAC support software has been unable to search for text within the body of call history because of the limitations of SQL text searches within huge text blob structures.  During a text search, SQL Server's memory and temporary file resources are exceeded and the server crashes.  The benefit of searching past calls for solutions is obvious.

The Athena system is built upon Natural Language Processing (NLP) techniques developed by Artificial Ingenuity, LLC, as described in "Theseus Technology".  This is a word recognition system based upon "fuzzy logic" principles.  It offers a flexible way to find the right data despite spelling or grammatical errors, and it circumvents the SQL architectural limitations for text searching.
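To give a feel for what error-tolerant word matching means, here is a minimal sketch in Python.  It is not the Theseus algorithm, whose details are not described here.  It substitutes the similarity ratio from Python's standard difflib module, and the function name, the sample vocabulary, and the 0.8 cutoff are all assumptions made for illustration.

```python
import difflib

def closest_word(word, vocabulary, cutoff=0.8):
    """Return the vocabulary word most similar to `word`, or None.

    Illustrative stand-in only: uses difflib's similarity ratio,
    not the fuzzy logic scheme used by Athena.
    """
    candidates = difflib.get_close_matches(
        word.lower(), vocabulary, n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None

# Hypothetical vocabulary of words known to the index.
vocabulary = ["printer", "error", "server"]

print(closest_word("printr", vocabulary))  # a misspelling of "printer"
print(closest_word("xyz", vocabulary))     # nothing similar enough
```

A misspelled search word can then be mapped onto a word that actually occurs in the indexed text before the search runs.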

Using Athena Search Remote

The Athena Search software is easy to use if you are familiar with Google, Yahoo, or other Web search engines, and the principles behind the Athena system are similar to theirs.  The basic principle is simply finding occurrences of a phrase or individual words within the large amount of text you wish to search.  The challenge is doing this in a timely fashion while searching millions or billions of words in an unformatted text archive.

The Athena Search window looks like this:

The components of the user interface are as follows:

Search Text:

The Search Text edit is where you specify the words or phrase you wish to search for in the call history database.  The phrase or list of words should be as specific as possible to minimize the search time.  For example, searching for "error" will return thousands of matches and might take 5 or more minutes to run, whereas searching for "error 54" will return much more specific calls and take only about 15 seconds.  The more generic or common the search terms, the longer the search will take and the less likely it is to return only the calls you are interested in.  The more specific the search terms, the faster the search will be and the more likely you are to retrieve the calls you want.

Search Button:

Click this button when you want to search for the terms specified in the Search Text edit.  You may also press "Enter" while the Search Text edit is in focus for the same result as clicking the Search Button.

Phrase Search check box:

If this box is checked (default), the search results will be limited to calls that contain the words in the Search Text edit as a complete phrase.  For example, if the Search Text edit contains the words "error during EOD", the search will match this sentence: "The WinSAMICU took an error during EOD process last night.", but will not match this sentence: "Last night's EOD took an error during the third phase".  If the Phrase Search check box is not checked, both sentences will match, since they contain all three search words.

Searching for a complete phrase is faster than searching for a non-phrase match, so if possible try to come up with distinct phrases if you have a good idea of what you are looking for.  If you want to match more calls or are not exactly sure of a specific phrase you can un-check this box, but be patient if it takes longer to search!  The search time is a function of the number of matching calls being evaluated.
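The difference between the two modes can be sketched in a few lines of Python.  This is only an illustration of the matching rule described above, not Athena's implementation.  The tokenize and matches functions are hypothetical names, and the simple tokenizer is an assumption.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (illustrative only)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches(search_text, call_text, phrase_search=True):
    """Return True if call_text matches search_text under the chosen mode."""
    query = tokenize(search_text)
    words = tokenize(call_text)
    if not query:
        return False
    if phrase_search:
        # Phrase mode: the query words must appear consecutively, in order.
        n = len(query)
        return any(words[i:i + n] == query for i in range(len(words) - n + 1))
    # Non-phrase mode: every query word must appear somewhere in the call.
    return set(query) <= set(words)

first = "The WinSAMICU took an error during EOD process last night."
second = "Last night's EOD took an error during the third phase"

print(matches("error during EOD", first))                        # True
print(matches("error during EOD", second))                       # False
print(matches("error during EOD", second, phrase_search=False))  # True
```

The two example sentences are the ones used above: the first contains the exact phrase, while the second contains all three words but not in sequence.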

Exclude Common Words check box:

If this box is checked (default), the search algorithm will attempt to ignore any search words that are among the 1000 most common words found in the database.  This will often speed up the search process if common words are included in a search phrase.  If the search phrase contains only common words, this check box has no effect.
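As a rough sketch of this behavior, and not the actual Athena code, the following Python shows one way a list of common words could be built and applied.  The function names are hypothetical, and counting words by simple whitespace splitting is an assumption.

```python
from collections import Counter

def top_common_words(documents, limit=1000):
    """Return the `limit` most frequent words across all documents."""
    counts = Counter(
        word for doc in documents for word in doc.lower().split()
    )
    return {word for word, _ in counts.most_common(limit)}

def filter_query(query_words, common_words, exclude_common=True):
    """Drop common words from the query when the option is enabled."""
    if not exclude_common:
        return list(query_words)
    kept = [w for w in query_words if w.lower() not in common_words]
    # If every word is common, the option has no effect:
    # fall back to the original query.
    return kept if kept else list(query_words)

common = {"the", "error", "during"}  # hypothetical common-word list
print(filter_query(["error", "during", "EOD"], common))  # ['EOD']
print(filter_query(["the", "error"], common))            # ['the', 'error']
```

The fallback in the last line of filter_query mirrors the rule that a phrase made up only of common words is searched unchanged.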

Unique Tokens check box:

If this box is checked (default), the search algorithm will ignore duplicate uses of a given word in the search phrase.  This will speed up the search process in certain cases.
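A minimal sketch of the idea in Python, assuming that duplicates are compared without regard to case and that the first occurrence of each word is the one kept (neither detail is stated above):

```python
def unique_tokens(query_words):
    """Remove repeated words from a query, keeping first occurrences in order."""
    seen = set()
    result = []
    for word in query_words:
        key = word.lower()  # assumption: comparison ignores case
        if key not in seen:
            seen.add(key)
            result.append(word)
    return result

print(unique_tokens(["error", "54", "Error", "printer"]))
# ['error', '54', 'printer']
```

Each distinct word is then looked up only once, which is where the time saving would come from.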

Max. Results spin edit:

This spin edit control specifies the maximum number of calls that are returned for the search operation.  Each search match is given an internal ranking number based on the quality and quantity of its matches to the search terms specified.  With the default setting of 10, you will be returned the top 10 matches per the ranking system, which should translate to the 10 best matches.  The higher you set this number, the longer the searching and ranking will take.  If there is a very large number of matches to a given search term, this can take quite a long time, so be careful with how you use this setting.
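Selecting the best matches from a ranked set can be sketched as follows.  The ranking formula itself is internal to Athena and is not reproduced here.  The call identifiers and rank values in this Python example are made up, and top_matches is a hypothetical name.

```python
import heapq

def top_matches(scored_calls, max_results=10):
    """Return the max_results highest-ranked (call_id, rank) pairs, best first."""
    return heapq.nlargest(max_results, scored_calls, key=lambda pair: pair[1])

# Hypothetical (call_id, rank) pairs produced by some scoring step.
scored = [("A", 3), ("B", 9), ("C", 5), ("D", 1)]
print(top_matches(scored, max_results=2))  # [('B', 9), ('C', 5)]
```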

Mode radio button:

You can ignore this setting unless your network administrator gives you further instructions.  This determines whether you are connecting to a local or remote Athena Server.  This option may be disabled on your system.

Call list:

This text grid contains the matching calls that were returned by the last search operation.  An example would look something like this:

The list is sorted by "Rank" by default.  Rank is an internal score that should reflect the degree to which the call matches the search terms.

You may sort the returned list by clicking the header for any of the columns, and the list will then be sorted in ascending order based on that column.  The "Rank" column is the exception to this and will be sorted in descending order (highest to lowest).
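The sorting rule can be expressed compactly.  In this Python sketch only the "Rank" column name comes from the description above.  The "Call" and "Date" columns and the sample rows are hypothetical.

```python
def sort_calls(calls, column):
    """Sort call records by a column: ascending, except Rank (descending)."""
    descending = (column == "Rank")
    return sorted(calls, key=lambda call: call[column], reverse=descending)

# Hypothetical rows; the real grid columns may differ.
calls = [
    {"Call": 1002, "Date": "2005-03-14", "Rank": 40},
    {"Call": 1001, "Date": "2005-06-02", "Rank": 75},
    {"Call": 1003, "Date": "2004-11-30", "Rank": 12},
]

print([c["Call"] for c in sort_calls(calls, "Rank")])  # [1001, 1002, 1003]
print([c["Call"] for c in sort_calls(calls, "Date")])  # [1003, 1002, 1001]
```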

Double clicking on a specific call will load the call contents into the memo field as described below.

Call contents:

This memo field shows the actual text contents of the call that you select for display.  To select a call, double click on its record in the "Call list" as described above.  The displayed call contents will highlight in red the search terms that led to the match.  This makes it easier to locate the specific portion of the text you are interested in.

Below is an example displayed call:

 

Technical Details of Athena System

Multi-Systems, Inc.

 


Contact info@artificialingenuity.com
Copyright © 2005 Artificial Ingenuity, LLC
Last modified: June 28, 2005
Initial design by Webinizer, LLC