IDKSM Search Engine

Forget the explanation, just give me the file: download it here.
Wait a minute, let me see a demo first.

Now in:
Spanish - Danish - Dutch - French - Slovenian

What is it?

The IDKSM* Search Engine is a full text search engine that was designed for use on CD-ROMs and Intranets. The engine consists of 2 parts.

  1. The Indexer which is used to process the HTML files and place the output into a database.
  2. The runtime engine which is used to read that database and present a hit list that matches the keywords you are searching for.
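
The two-part split can be pictured with a minimal Python sketch. It is not IDKSM's implementation or database format: the `build_index` and `search` names, the in-memory dictionary standing in for the database, and the sample pages are all assumptions made for illustration.

```python
import re
from collections import defaultdict

TAG = re.compile(r"<[^>]+>")        # crude HTML tag stripper, enough for a sketch
WORD = re.compile(r"[A-Za-z0-9]+")


def build_index(pages):
    """Indexer step: map each lowercase word to the set of files containing it."""
    index = defaultdict(set)
    for filename, html in pages.items():
        text = TAG.sub(" ", html)
        for word in WORD.findall(text.lower()):
            index[word].add(filename)
    return index


def search(index, keywords):
    """Runtime step: return the sorted hit list of files containing every keyword."""
    hits = None
    for word in keywords:
        files = index.get(word.lower(), set())
        hits = set(files) if hits is None else hits & files
    return sorted(hits or [])


# Hypothetical sample content, not taken from the IDKSM release.
pages = {
    "intro.html": "<h1>Search engines</h1><p>Full text search for CD-ROMs.</p>",
    "setup.html": "<p>Install the indexer, then run a search.</p>",
}
index = build_index(pages)
hit_list = search(index, ["search", "indexer"])
```

The point of the split is that the expensive pass over the HTML happens once, at indexing time, while the runtime only does dictionary lookups and set intersections.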

Currently four versions of the runtime engine exist.

  1. One written in Java 1.0.2 which has been tested and works on the following:
    • Netscape 3.0
    • Netscape 4.x
    • Internet Explorer 3.0
    • Internet Explorer 4.0
    • HotJava 1.0
  2. A second version of the engine is written for use in Delphi. It consists of a simple .dcu file that you just need to compile along with the rest of your code. Delphi 1 and 3 versions are available.
  3. The third version is a DLL. A 16-bit and 32-bit DLL are included. Any applications that can use a DLL can now include a Full Text Search Engine.
  4. The fourth version consists of a Windows Web Server CGI application and a Java Servlet. The Windows CGI should work with any Windows based web server. The Servlet has been tested with Apache but again it should work with any web server that deals with servlets.

Sample code for each of the engines, except the CGI versions, is included in the release.

The Indexer

Fig. 1 - The Indexer Window

The Indexer is the shareware product. The version you find in the release is fully functional but it will only process 10 files. Once you register the product you can process as many files as you wish. I have successfully created projects with 2,000 HTML files. Registration is a simple matter; see the bottom of the page for instructions.

As of the 1.3 release you can index HTML and ASCII files. You can also instruct the Indexer to link to any file. This means that you can index the content from ANY type of file and have the database point to the actual file that has that content. For example, you could take the narration found in an audio file and place it in a text file. Run the Indexer against the text file but have it point to the audio file. When the user searches for specific content, the audio file will be indicated as having that content and the user can then elect to listen to it. Any file type that a browser can handle, the Indexer can easily work with.
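
As a rough illustration of the link idea, the sketch below indexes the words of a transcript but records a different file as the hit target. The function name `index_with_target`, the file name `tour.wav`, and the transcript text are hypothetical; how the Indexer itself stores the link is not shown here.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[A-Za-z0-9]+")


def index_with_target(index, text, target):
    """Index the words of `text`, but record `target` as the file to open."""
    for word in WORD.findall(text.lower()):
        index[word].add(target)


index = defaultdict(set)

# Hypothetical transcript of an audio narration; the hit should open the audio file.
transcript = "Welcome to the guided tour of the museum."
index_with_target(index, transcript, target="tour.wav")

hits = sorted(index.get("museum", set()))
```

Here a search for "museum" reports the audio file, even though the words came from the text file.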

Delphi Runtime Engine

Fig. 2 - Sample Delphi Search Engine Dialog

Figure 2 shows a sample of how the Delphi Search Engine Dialog can look. This is included in the release. You can modify this or use it in any fashion you wish. The engine uses standard boolean logic and now understands the wildcard character "*". See the ToDo list in the release for future features. After the user enters the words they want to search on, they select the "Search" button. If the search is successful, a hit list of matching files appears along with the count of how many documents contained those words. Double-clicking the line item, or selecting it and clicking "Go", will process the information in the manner that you choose.
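
To make the boolean and wildcard behavior concrete, here is a small Python sketch. The query syntax (uppercase AND, OR, NOT evaluated strictly left to right, and "*" only at the end of a word) is an assumption for illustration; the release documentation defines what the engine actually accepts.

```python
def match_term(index, term):
    """Return files containing `term`; a trailing "*" matches any word with that prefix."""
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        files = set()
        for word, word_files in index.items():
            if word.startswith(prefix):
                files |= word_files
        return files
    return set(index.get(term, set()))


def evaluate(index, query):
    """Evaluate 'a AND b OR c NOT d' left to right (no precedence, no parentheses)."""
    tokens = query.split()
    if not tokens:
        return []
    result = match_term(index, tokens[0])
    for op, term in zip(tokens[1::2], tokens[2::2]):
        files = match_term(index, term)
        if op == "AND":
            result &= files
        elif op == "OR":
            result |= files
        elif op == "NOT":
            result -= files
        else:
            raise ValueError(f"unknown operator: {op}")
    return sorted(result)


# Hypothetical word-to-files index standing in for the IDKSM database.
index = {
    "index": {"a.html"},
    "indexer": {"a.html", "b.html"},
    "delphi": {"b.html", "c.html"},
    "java": {"c.html"},
}
hits = evaluate(index, "index* AND delphi")
count = len(hits)
```

The `hits` list and `count` correspond to the hit list and document count shown in the dialog.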

Java Runtime Engine

Fig. 3 - Java Applet Search Engine Window

The Java engine is written in Java 1.0.2 so that a larger number of browsers can use the product. It was also written so that it doesn't require the applet to be trusted. It can work off of a CD-ROM, hard drive, or intranet. Due to its design it requires a CGI application to be used over the Internet (see below).

The applet window above accepts several parameters that modify the title and indicate how the results are handled. It is assumed to be running in a frames environment, but that isn't necessary. The design lends itself to a frames environment. If you place the applet button on a frame that will always be available, the engine is loaded upon start-up and is always accessible and fast. It also means that any hit list you create stays around until you ask for another one, so the user can keep trying out the different files the hit list found without generating a new hit list. Since the search information is in a separate applet window, it can be closed and opened over and over again without losing the data. Several parameters are passed to the applet to allow this to work. The sample code details the information that needs to be passed.

With the 1.2 release the Java interface seen in Fig. 3 has been separated from the Search Engine itself. The code used to create the interface above is included with the release, which allows you to write any kind of interface you like. It also allows the Search Engine to be used in standard Java applications.

DLL Runtime Engine

Both 16-bit and 32-bit versions of the DLL are included in the release. This allows you to write any interface to the Engine that you desire, in any language you wish, so long as it can access a DLL. Examples in Delphi and Visual Basic are supplied in the release.

CGI Runtime Engine

Two CGI versions have been available since the 1.3 release. One is a Windows-based web server CGI application. It should work with any standard Windows web server. The other is a Java Servlet. It has been tested with Apache and should work with any web server that can handle servlets. In fact, the demos found on this page use the servlet version running on Apache. Both of these CGI applications talk to and work with the same Java Applet that you use for a CD-ROM setup. The help documentation explains how to configure your Applet call to use the CGI applications.

Features included in Version 1.4

  1. The Indexer works in batch mode.
    Command line arguments accepted by the Indexer.

    <path>IDKSM Indexer.exe <path>[project] autorun
    1. The first parameter is of course the project file that contains all the information required.
    2. The second parameter is "autorun" which is the command telling it to, well as it says, autorun.
  2. The Indexer now uses a project file which stores your options and file selections for a given job. The .idksm file type is now registered and linked to the Indexer. This allows you to set options and files on a project-by-project basis. It also makes batch processing easier because you only need to list the project file to open along with the autorun command.
  3. Color modification. You can now change the applet's background, foreground, button background and button foreground colors via param statements in the applet call.
  4. A bug was fixed in which the last word processed was dropped and not added to the dictionary.
  5. A new tag is being recognized. It is <IDKSM ignore> </IDKSM>. In some cases users want to index a file but do not want certain sections of the file to be indexed. By placing the section you DO NOT WANT indexed between these tags it will not be processed by the Indexer.
  6. CD-ROM launch program. I have included a program called launch.exe with this distribution. In some cases users would like to have their CD use autorun to open their index.html file. However, autorun will only work with .exe files. You can use the launch program to make the CD autorun your HTML content. To do so, add the line open=launch.exe index.htm to your autorun.inf file. This will open the user's browser and load the .htm file you indicate. Be sure to use paths if needed.
  7. A bug was fixed in the Delphi Engine. This bug also existed in the Windows CGI application and the DLL. Be sure to use the new versions.
  8. We now have a list of default Spanish stopwords to go along with the Indexer.
  9. <META> tags are recognized, in particular one designed specifically for use by IDKSM: <META name=IDKSMTitle content="[your title]">. In some cases users don't have the option of giving specific titles using the <Title> tag. It can also be inconvenient to modify a large filelist.txt file to add the T: option with a different title. This <META> tag lets you add a title just for use by IDKSM directly in the HTML content.
  10. The <META name=keywords content=""> tag is also honored. Currently the keywords are added to the dictionary with the rest of the body text. In future versions the keywords and body content will be kept separate so that you can search on one or the other. For now, if you only want to use keywords in your search engine, you can use the <IDKSM ignore> tag on the body of the file and then only the keywords will be included in the dictionary.
  11. The Indexer now does the majority of its work from the hard drive. Previously the Indexer tried to do most of the work dynamically in RAM. Some users have a HUGE number of files that contain a lot of content, which required them to have HUGE amounts of RAM. Now the Indexer does its sorting and word access directly on the hard drive. The downside is longer processing times for large databases. It's a trade-off.
  12. You can now specify a complete URL for the F: option in the filelist.txt file. This means that you can index files from several machines or hosts and collect all of the content into one database. Then, by giving the complete URL, e.g.
    • F:http://www.miraclec.com/software.html

    the file can be accessed on this other host. This URL can be an HTTP:, FTP:, FILE: or any kind of valid URL. Please note that in testing it was observed that IE didn't work exactly right when using the FILE: URL. Talk to me if this is something you want to use.

  13. A bug was still hanging around from earlier versions of the engine that limited file searches to fewer than 3,200. This has been removed.
  14. A bug was found that involved word stemming when using the "*" character. It didn't happen all the time but was dependent on word order in the dictionary. This has been fixed.
  15. Another holdover from an earlier version was found. This wasn't a bug; it was a feature, and it has been removed. It forced the selected file to always be converted to lower case, which caused files with upper case letters not to be found.
  16. The interface now honors a double-click in the list box. You can now double-click on your selection and no longer need to use the "Go" button.
  17. The interface is now multi-language enabled. It can now easily support alternate languages and already has Spanish and Danish included. A new parameter has been included to the applet call.
    • <param name=language value=en> {for english}
    • <param name=language value=sp> {for spanish}
    • <param name=language value=da> {for danish}
    • <param name=language value=da> {for dutch}
    • <param name=language value=fr> {for french}
    • <param name=language value=sl> {for slovenian}

    The interface defaults to English if the parameter isn't supplied.

  18. The IE HTTP 1.1 bug has been worked around. All versions of IE should now work with the CGI versions of IDKSM.
  19. A bug, where &nbsp; (non-breaking space) wasn't honored correctly, has been fixed.
  20. All indexed words have a size limit of 27 characters. The previous sorting routine handled words larger than this. The new sorting routine didn't and would trash the database files. This has been corrected.
  21. The Indexer and the various runtime engines now recognize and work with extended characters. Previously extended characters, which are those characters other than 0-9, a-z, and A-Z, were not handled correctly. As of this release you can index and search with these characters.
  22. Error codes are now available in both the Delphi and DLL versions of the runtime engines.
  23. Possibly one or two other items I can't recall right now. :-)
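
Features 5, 9 and 10 above describe how the `<IDKSM ignore>` and `<META>` tags affect what gets indexed. The Python sketch below is one loose reading of those rules, not the Indexer's own parser: the `extract` function, the regular expressions, the choice to let IDKSMTitle take priority over `<title>`, and the sample page are all assumptions.

```python
import re

# Everything between <IDKSM ignore> and </IDKSM> is left out of the index.
IGNORE = re.compile(r"<IDKSM\s+ignore>.*?</IDKSM>", re.IGNORECASE | re.DOTALL)
# <META name=... content="..."> with an optionally quoted name.
META = re.compile(
    r'<META\s+name=["\']?([\w-]+)["\']?\s+content="([^"]*)"\s*/?>', re.IGNORECASE
)
TITLE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def extract(html):
    """Return (title, indexable_text) for one HTML page."""
    metas = {name.lower(): content for name, content in META.findall(html)}
    title_match = TITLE.search(html)
    fallback = title_match.group(1).strip() if title_match else ""
    # Assumption: the IDKSMTitle META tag takes priority over <title>.
    title = metas.get("idksmtitle") or fallback
    body = IGNORE.sub(" ", html)
    text = " ".join(TAG.sub(" ", body).split())
    # Keywords go into the same dictionary as the body text.
    keywords = metas.get("keywords", "")
    return title, (keywords + " " + text).strip()


# Hypothetical page, not taken from the IDKSM release.
page = (
    '<META name=IDKSMTitle content="Release notes">'
    '<META name=keywords content="indexer batch">'
    "<p>Visible text.</p>"
    "<IDKSM ignore><p>Navigation links</p></IDKSM>"
)
title, text = extract(page)
```

Wrapping the whole body in `<IDKSM ignore>` would leave only the keywords in `text`, which matches the keywords-only workaround described in feature 10.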

Is there a feature you would like to see? Write me and tell me about it. Maybe it is on our todo list or we can add it.

Demo

I have combined all the demos into one page. On this one page you can see examples of the applet in different languages, colors, and as an embedded applet. To see the demo your browser must have Java enabled.

Demo - Due to the hacker attack the demo is currently offline.

Download

Version 1.4 of the Search Engine has been released and can be found on the FTP site. Until the 1.4 version is more widely available I will retain the links to the 1.3 version.

You can download the file from several sites. If one doesn't work well for you please try another one.

Sites where the Ver. 1.4 file is available: (1,454,103 bytes)

  1. Miracle Concepts - FTP
  2. Torry's Delphi Pages

Don't forget to check the Update section.

Call for foreign language assistance:

I would like to release versions of the Runtime Engine in different languages. However, not being fluent in different languages I need your help. I would appreciate anyone giving me information as to the comparable names of buttons and text in different languages. Send me an email and I can send you a list of the exact words and phrases I need translated. -- Thank you.

Updates

Mailing Lists

Two mailing lists have been set up to support the IDKSM Search Engine. The first list is used to announce new versions and features. The second list is used to report bugs, request new features, and give users a place to share problems and ideas. To subscribe to the first mailing list for announcements, send email to:

idksm-announce-request@miraclec.com

and place the word subscribe in the subject line. To subscribe to the second mailing list for user feedback and to report bugs send email to:

idksm-request@miraclec.com

and place the word subscribe in the subject line.

NOTE: Please do not add any additional text to the body of a subscription request because the email is not read by anyone. The subscription is processed automatically. Only send mail to the idksm@miraclec.com address.

To get more help information on how to subscribe and unsubscribe to either of these lists, send email to:

idksm-announce-request@miraclec.com
or
idksm-request@miraclec.com

and place the word help in the subject line.

Once you have subscribed to the mailing list you can send email to the second mailing list by using the address:

idksm@miraclec.com

Registration

The price of IDKSM is currently $50. This gets you a completely functional Indexer and the rights to distribute royalty free as many copies of the Runtime Engines as you wish. The Indexer is not to be distributed.

After you pay your registration fee I will send you a key to unlock the Indexer.

Payment method:

You have three ways to make a payment: secure online payment with a credit card, i-check online bank check, or a check sent by snail mail.

Please note that our store uses JavaScript, so it must be enabled in your browser to go shopping. The store also uses cookies to store your shopping cart information. No other information is collected or stored. A secure server is used to collect your credit card information so your transactions are completely safe.

Go Shopping

If you are still uncomfortable about using your credit card online, you can fax me the required information.

FAX: 570-388-6101

or send a check by mail:

Amount:

$50.00/copy

Pay to the order of:

Miracle Concepts, Inc.

Mail to:

Miracle Concepts, Inc.
74 Hex Street (Harding)
West Pittston, PA 18643-9615

Note:

Please include a valid email address so that I know where to send the registration key.

*IDKSM - I Don't Know Search Me

Copyright © 1998-2004, Miracle Concepts, Inc., All Rights Reserved

Last update: April 5, 2004