Re: SearchBlox J2EE Search Component Version 1.1 released
On Tuesday 02 December 2003 09:51, Tun Lin wrote:
> Does anyone know of a search engine that supports XML formats?

There's no way to generally "support xml formats", as XML is just a meta-language. However, when building specific search engines on the Lucene core, it should be reasonably straightforward to implement more accurate, XML-structure-aware tokenization for specific XML applications like DocBook or other domain-specific apps. So, if any search engine advertises "indexing xml content", one had better read the fine print to learn what they really claim.

It might be interesting to create a Lucene plug-in that, given a specification of how the sub-trees under specific elements should be handled, would tokenize and index content into separate fields. Plus, the implementation shouldn't be very difficult: just use a standard XML parser (SAX, DOM), match XPaths, feed the matched content to an analyzer, and add it to the index. This could also be used for HTML (pre-filtering with JTidy or similar first to get XML-compliant HTML).

I wouldn't be surprised if someone on the list has already done this.

-+ Tatu +-

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
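The plug-in idea in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Lucene component: the class name `XmlFieldExtractor`, the `extract` method, and the path-to-field rule map are hypothetical names invented here. Paths are plain slash-separated element names rather than real XPath expressions, and the block uses only the JDK's SAX parser. The Lucene step (handing each returned field to an analyzer and adding it to a document) is deliberately left out so the sketch has no external dependencies.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: routes text under configured element paths into named fields. */
final class XmlFieldExtractor {
    private XmlFieldExtractor() {}

    /**
     * @param xml         a well-formed XML document
     * @param pathToField rules such as "/book/title" -> "title"; descendants of a
     *                    matched element inherit its field unless they match a rule
     * @return field name -> whitespace-normalized text collected for that field
     */
    static Map<String, String> extract(String xml, Map<String, String> pathToField) {
        final Map<String, StringBuilder> buffers = new LinkedHashMap<>();
        final Deque<String> path = new ArrayDeque<>();   // open elements, root first
        final Deque<String> fields = new ArrayDeque<>(); // field per open element, "" = none

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                path.addLast(qName);
                String inherited = fields.isEmpty() ? "" : fields.peekLast();
                String current = "/" + String.join("/", path);
                fields.addLast(pathToField.getOrDefault(current, inherited));
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String field = fields.peekLast();
                if (field != null && !field.isEmpty()) {
                    buffers.computeIfAbsent(field, k -> new StringBuilder()).append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                path.removeLast();
                String field = fields.removeLast();
                if (!field.isEmpty()) {
                    // Separate text of adjacent elements; note this also splits
                    // words that span inline markup.
                    buffers.computeIfAbsent(field, k -> new StringBuilder()).append(' ');
                }
            }
        };

        try {
            // The default factory is not namespace-aware, so qName is the raw tag name.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }

        Map<String, String> result = new LinkedHashMap<>();
        buffers.forEach((field, text) ->
                result.put(field, text.toString().replaceAll("\\s+", " ").trim()));
        return result;
    }
}
```

With rules mapping `/book/title` to a `title` field and `/book/chapter` to a `body` field, each entry of the returned map would then be added to the index as its own field. For HTML input, the same method could be applied after a tidy-up pass has produced well-formed markup.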
Re: SearchBlox J2EE Search Component Version 1.1 released
Tun Lin wrote:
> Does anyone know of a search engine that supports XML formats?

See the SAX/DOM XML demo in the Lucene sandbox:
http://jakarta.apache.org/lucene/docs/lucene-sandbox/

--
open technology: www.media-style.com
open source: www.weta-group.net
open discussion: www.text-mining.org
RE: SearchBlox J2EE Search Component Version 1.1 released
Does anyone know of a search engine that supports XML formats?

-----Original Message-----
From: Robert Selvaraj [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 03, 2003 12:36 AM
To: Lucene Users List
Subject: Re: SearchBlox J2EE Search Component Version 1.1 released

No. The formats supported by SearchBlox are given here:
http://www.searchblox.com/faqs/question.php?qstId=5
Re: SearchBlox J2EE Search Component Version 1.1 released
No. The formats supported by SearchBlox are given here:
http://www.searchblox.com/faqs/question.php?qstId=5

Tun Lin wrote:
> Hi,
>
> Does it support xml?
RE: SearchBlox J2EE Search Component Version 1.1 released
I am seriously impressed with that: very smooth-looking and easy to use. It's a shame it's quite pricey ...

-----Original Message-----
From: Tate Avery [mailto:[EMAIL PROTECTED]
Sent: 02 December 2003 15:45
To: Lucene Users List
Subject: RE: SearchBlox J2EE Search Component Version 1.1 released

If you buy it, apparently:
http://www.searchblox.com/buy.html
RE: SearchBlox J2EE Search Component Version 1.1 released
Hi,

Does it support xml?

-----Original Message-----
From: Tate Avery [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 11:45 PM
To: Lucene Users List
Subject: RE: SearchBlox J2EE Search Component Version 1.1 released

If you buy it, apparently:
http://www.searchblox.com/buy.html
RE: SearchBlox J2EE Search Component Version 1.1 released
If you buy it, apparently:
http://www.searchblox.com/buy.html

-----Original Message-----
From: Tun Lin [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 10:43 AM
To: 'Lucene Users List'; [EMAIL PROTECTED]
Subject: RE: SearchBlox J2EE Search Component Version 1.1 released

Hi,

Just some feedback.

SearchBlox can only search HTML files. Will SearchBlox support PDF, XML and Word documents in the future? It would be perfect if it could support all the document types mentioned above.
RE: SearchBlox J2EE Search Component Version 1.1 released
Hi,

Just some feedback.

SearchBlox can only search HTML files. Will SearchBlox support PDF, XML and Word documents in the future? It would be perfect if it could support all the document types mentioned above.

-----Original Message-----
From: Robert Selvaraj [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 10:42 PM
To: Lucene Users List; [EMAIL PROTECTED]
Subject: SearchBlox J2EE Search Component Version 1.1 released
RE: SearchBlox J2EE Search Component Version 1.1 released
Wow. Bravo. This is a fantastic search component. Thank you for providing this information. :-) Three cheers!

-----Original Message-----
From: Robert Selvaraj [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 10:42 PM
To: Lucene Users List; [EMAIL PROTECTED]
Subject: SearchBlox J2EE Search Component Version 1.1 released
SearchBlox J2EE Search Component Version 1.1 released
SearchBlox is a J2EE search component that enables you to add search functionality to your applications, intranets or portals in a matter of minutes. SearchBlox uses Lucene Search API and features integrated HTTP and File System crawlers, support for different document formats, support for indexing and searching content in 15 languages and customizable search results, all controlled from a browser-based Admin Console.

Main features in this update:
=============================
- Asian language support. SearchBlox now supports Japanese, Chinese Simplified, Chinese Traditional and Korean language content.
- Performance enhancements to search
- Improved Hit Highlighting

SearchBlox is available as a Web Archive (WAR) and is deployable on any Servlet 2.3/JSP 1.2 compliant server. SearchBlox Getting-Started Guides are available for the following servers:

JBoss - http://www.searchblox.com/gettingstarted_jboss.html
Jetty - http://www.searchblox.com/gettingstarted_jetty.html
JRun - http://www.searchblox.com/gettingstarted_jrun.html
Pramati - http://www.searchblox.com/gettingstarted_pramati.html
Resin - http://www.searchblox.com/gettingstarted_resin.html
Tomcat - http://www.searchblox.com/gettingstarted_tomcat.html
Weblogic - http://www.searchblox.com/gettingstarted_weblogic.html
Websphere - http://www.searchblox.com/gettingstarted_websphere.html

The SearchBlox FREE Edition is available free of charge and can index up to 1000 HTML documents.

The software can be downloaded from http://www.searchblox.com