Re:Mag Gam
http://teca4gso.teca4design.be/xtg/bvafrstv.extbqa Mag Gam 7/21/2013 7:28:20 AM - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org
Is lucene right for us
Hello All, At my university we have over 20,000 small files, ranging from 20k to 500k, per directory, and we would like to index them. I was wondering if Lucene is the right tool for this? The information we would like to keep is: filename, filesize, filedate, filecontent. Also, is it possible to run the initial index in multithreaded mode, since we are talking about many directories with similar contents? TIA
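On the multithreaded part of the question, a minimal sketch of the "one task per directory" idea, using only the JDK. The class and method names (`ParallelWalk`, `handle`, `countFiles`) are made up for illustration, and `handle` is only a stand-in for the real indexing step; whether one shared IndexWriter is safe to call from several threads should be checked against the javadocs of the Lucene version in use.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelWalk {
    // Hypothetical stand-in for "index this file": it only reads the
    // metadata named in the post. A real indexer would add a document here.
    static void handle(File f) {
        String name = f.getName();
        long size = f.length();
        long date = f.lastModified();
    }

    // Runs one task per directory on a fixed-size pool and returns
    // how many regular files were handled in total.
    static int countFiles(List<File> dirs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<Future<Integer>>();
            for (final File dir : dirs) {
                Callable<Integer> task = new Callable<Integer>() {
                    public Integer call() {
                        File[] files = dir.listFiles(); // null if missing or unreadable
                        if (files == null) return 0;
                        int n = 0;
                        for (File f : files) {
                            if (f.isFile()) { handle(f); n++; }
                        }
                        return n;
                    }
                };
                pending.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : pending) total += result.get();
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Splitting by directory keeps the tasks independent, so the only shared state is whatever `handle` writes to.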
Advantage of putting lucene index in RDBMS
I have been reading the lists for a couple of weeks now, and I noticed people asking about placing their indexes into an RDBMS. What is the advantage of that? So far Lucene has been able to solve all my problems, but I am curious how else people are using it (especially with an RDBMS). TIA
Re: Advantage of putting lucene index in RDBMS
I appreciate everyone's responses. I guess the main advantage of putting Lucene's index into an RDBMS is flexibility of queries. Personally, I would rather use an RDBMS for results than Lucene because I am more experienced with SQL queries than with Java. Does anyone have a simple example of using FileDocument (http://lucene.apache.org/java/docs/api/org/apache/lucene/demo/FileDocument.html), which includes the following fields: path, modified, and contents? I would like to try this approach. TIA!

On 10/5/06, Aleksei Valikov <[EMAIL PROTECTED]> wrote:

Hi.

> As one of the people who asked about placing indices into RDBMS, I was
> primarily interested in just storing index in the RDBMS (basically,
> storing the structures described on this page
> http://lucene.apache.org/java/docs/fileformats.html in the relational
> DB). The main reason is NOT to be able to perform some magic with
> joining Lucene and 'pure DB query' results (which, actually, IS useful
> in some circumstances, but don't really see a problem of doing it in
> Java after querying DB and Lucene), but rather avoid the cost of
> reindexing and associated problems in complex enterprise environments.

There is no problem joining/intersecting Lucene/DB results in the Java layer apart from performance. Imagine you have 10k results from Lucene and 10k results from the RDB and you only need results 20...40 ordered by 'name' field, ascending (which is the usual case). An SQL query with join and limit/offset would be much faster than joining 20k entries in Java.

> Yet another advantage of storing index in the DB is its 'manageability'
> and 'debuggability' (nice word!). Though there is Luke, etc.,
> administrators in big companies do not want to learn many new tools and
> having something already familiar to deal with can sometimes be a good
> argument in favor of product adoption. (BTW, Compass, as Aleksei
> mentioned, can be the answer to this prayer - meant to check it out long
> time ago, but haven't got around to it yet. Also, it seems like the
> project is half-dead. I wonder if it's true...)

Compass is a lively and active project, we successfully use it in production.

Bye.
/lexi
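To make the "join in the Java layer" case above concrete, here is a rough sketch with plain collections. It assumes both result sets have already been reduced to lists of names; `JoinInJava` and `page` are made-up names, and nothing here is Lucene or JDBC API. It shows why the approach costs more than a SQL join with limit/offset: the whole intersection has to be built and sorted before the requested page can be cut out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class JoinInJava {
    // Returns the names present in both result lists, sorted ascending,
    // restricted to positions [from, to) of the sorted intersection.
    static List<String> page(List<String> luceneNames, List<String> dbNames,
                             int from, int to) {
        Set<String> fromDb = new HashSet<String>(dbNames);
        List<String> both = new ArrayList<String>();
        for (String name : luceneNames) {
            if (fromDb.contains(name)) both.add(name);
        }
        Collections.sort(both);                 // full sort, even for a small page
        int start = Math.min(from, both.size());
        int end = Math.min(to, both.size());
        return new ArrayList<String>(both.subList(start, end));
    }
}
```

Calling `page(luceneNames, dbNames, 20, 40)` corresponds to the "results 20...40 ordered by name" case in the message.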
Lucene Jdbc Directory
Hi folks, After some late-night Google searching, I came across this interesting website: http://static.compassframework.org/docs/latest/jdbcdirectory.html Is this framework free? Is anyone using this? If so, how is it for newbies to Lucene/Derby/Java? I always wanted to have a Lucene index in an RDBMS ... TIA
correct classpath
Hi All, I am getting into Java + Lucene. To compile a Lucene program, CreateIndex.java:

    import org.apache.lucene.index.IndexWriter;

    public class CreateIndex {
        // usage: CreateIndex index-directory
        public static void main(String[] args) throws Exception {
            String indexPath = args[0];
            IndexWriter writer;
            // An index is created by opening an IndexWriter with the
            // create argument set to true.
            writer = new IndexWriter(indexPath, null, true);
            writer.close();
        }
    }

What CLASSPATH should I set? I currently have this: /home/tomcat/lucene-2.0.0/lucene-core-2.0.0.jar

Thanks!
Tomcat Simple Example
Hi All, Does anyone have a simple Tomcat search/result example? I have 4 text files I would like to index. Thanks
Re: Tomcat Simple Example
Thanks! So, when working with Tomcat, for a simple Index + Search, is it recommended to use JSP over servlets? Any advice?

On 8/23/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:

: > Does anyone have a simple Tomcat search/result example?
: > I have 4 text files,
: > i would like to index.

take a look at the getting started guide, and the demo WAR that comes with the Lucene distribution...

http://lucene.apache.org/java/docs/gettingstarted.html

-Hoss
Re: Tomcat Simple Example
Thanks for the response Erik! You make a good point. I have the 'Lucene in Action' book, and it has some good ways of doing things...its at work now (I am away for 3 weeks), thats the only bad thing :-( On 8/23/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Aug 23, 2006, at 5:18 PM, Mag Gam wrote: > So, when working with Tomcat, for a simple Index + Search, it is > recommend > to use JSP over servlets? > > any advice? No such recommendation would ever officially come from the Lucene community. Lucene is entirely independent of how search results get rendered. The demo application is nothing more than a demonstration, not a recommendation of technologies to use around Lucene. Whatever technologies best fit your environment is what I'd recommend :) I've used all types of technologies on top of Lucene, from a servlet, to JSP pages, Struts, Tapestry (what lucenebook.com uses) to now using Ruby on Rails backed by Solr (which fronts Lucene with servlets). Erik - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Query parser.parse (line);
Hi All, I am trying to parse a query line in a doGet method (J2EE). I keep getting this type of message: unreported exception org.apache.lucene.queryParser.ParseException; must be caught or declared to be thrown. Does anyone have an example of a class handling a thrown exception? I would kindly appreciate it. tia
Re: Query parser.parse (line);
Hi Erik, thanks for the quick reply. I am looking at this page: http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html Any chance there is a new version for 2.0? Or are there any 2.0 examples (other than the stock example)?

On 8/23/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:

On Aug 23, 2006, at 7:24 PM, Mag Gam wrote:
> I am trying to do a Query parse line in a doGET method (J2EE).
>
> I keep getting this type of message;
> unreported exception org.apache.lucene.queryParser.ParseException;
> must be
> caught or declared to be thrown
>
> Anyone have an example of a Class being thrown an exception?
>
> I would kindly appreciate it.

The simplest thing would be something like this in a servlet:

    try {
        Query query = parser.parse(...);
    } catch (ParseException pe) {
        throw new IOException(pe.getMessage());
    }

But you may not want the user to see that harsh of an error from your servlet when an unparsable expression is used. Typically that exception is caught and a friendlier message is provided to the user.

Erik
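The "catch it and show a friendlier message" pattern can be sketched without Lucene on the classpath. In the sketch below, `QuerySyntaxException` and `parse` are hypothetical stand-ins for Lucene's checked ParseException and for `parser.parse(...)`; only the try/catch shape is the point, not the toy validation rules.

```java
class FriendlyErrors {
    // Hypothetical stand-in for a checked parse exception.
    static class QuerySyntaxException extends Exception {
        QuerySyntaxException(String message) { super(message); }
    }

    // Hypothetical stand-in for parser.parse(...): rejects empty input
    // and an odd number of double quotes.
    static String parse(String line) throws QuerySyntaxException {
        if (line == null || line.trim().isEmpty()) {
            throw new QuerySyntaxException("empty query");
        }
        int quotes = 0;
        for (char c : line.toCharArray()) {
            if (c == '"') quotes++;
        }
        if (quotes % 2 != 0) {
            throw new QuerySyntaxException("unbalanced quote");
        }
        return line.trim();
    }

    // Because the exception is checked, the caller must either catch it
    // (as here) or declare it with "throws"; otherwise javac reports
    // "unreported exception ... must be caught or declared to be thrown".
    static String search(String line) {
        try {
            return "Searching for: " + parse(line);
        } catch (QuerySyntaxException e) {
            return "Sorry, we could not understand that query (" + e.getMessage() + ").";
        }
    }
}
```

In a servlet the friendly string would be written to the response instead of being returned.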
Re: Query parser.parse (line);
Very good advice! With the previous code you gave me, I was able to get everything working for 2.0! Good call!

On 8/23/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:

On Aug 23, 2006, at 8:45 PM, Mag Gam wrote:
> I am looking at this page
> http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html
>
> Any chance there is a new version for 2.0? Or are there any 2.0
> examples
> (other than the stock example)?

No, that article has not been updated for the 2.0 API. The main things that changed from the version of Lucene used in that article are:

* QueryParser must now be instantiated and the instance .parse() must be used instead of the previous QueryParser.parse() static method.
* BooleanQuery.add() now uses the Occur enumeration rather than the boolean flags.

You can reference the javadocs for the 2.0 QueryParser and BooleanQuery APIs here: <http://lucene.apache.org/java/docs/api/index.html> and that will help you resolve any compilation issues you get when using code examples for previous versions.

Given the nature of your questions I recommend you download Lucene 1.9 and use it for the time being (at least for the javadocs to see the deprecation messages with upgrade details), while you become familiar with Lucene. All the examples you'll find online and in Lucene in Action are 1.9 compatible, and the main change between 1.9 and 2.0 is that all the deprecated methods have been removed. Once you have a working system and understand Lucene's API in more detail, you can tidy up any deprecation warnings you get and upgrade to 2.0.

Erik
NoClassDefFoundError
Hi All, I keep getting this error in my Tomcat logs:

Aug 24, 2006 7:44:09 AM org.apache.catalina.core.ApplicationContext log
INFO: Marking servlet search as unavailable
Aug 24, 2006 7:44:09 AM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Allocate exception for servlet search
java.lang.NoClassDefFoundError: org/apache/lucene/queryParser/ParseException
    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2357)
    at java.lang.Class.getConstructor0(Class.java:2671)
    at java.lang.Class.newInstance0(Class.java:321)
    at java.lang.Class.newInstance(Class.java:303)
    at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1048)
    at org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:750)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:130)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:868)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:663)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:595)

My code works well on Linux, but on Solaris 9 it tanks; I keep getting this exception. Is there anything I can tweak? The code in question, I am assuming, is this:

    try {
        Query query = parser.parse(request.getParameter("param1"));
        Hits hits = searcher.search(query);
        out.println(hits.length() + " results");
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            out.println(doc.get("path"));
            out.println("");
        }
    } catch (ParseException pe) {
        throw new IOException(pe.getMessage());
    }

Any thoughts? tia
Re: NoClassDefFoundError
Thank you! You are right. Seems like Tomcat overrides my path. I had to manually move the .jar files into Tomcat's presence.

On 8/24/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:

My hunch is you don't have the Lucene JAR in the classpath at runtime.

Erik
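A quick way to confirm a hunch like this from inside the running webapp is to ask the class loader directly. The sketch below uses only the JDK; `ClasspathCheck` and `isOnClasspath` are illustrative names. It assumes the check runs in the same web application as the failing servlet, because each webapp gets its own class loader in a servlet container, and the usual place for a webapp's jars is its WEB-INF/lib directory.

```java
class ClasspathCheck {
    // Returns true if the named class can be found by the current thread's
    // context class loader, without running the class's static initializers.
    static boolean isOnClasspath(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class file was found but could not be linked,
            // for example because one of its own dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.lucene.queryParser.ParseException";
        System.out.println(name + " visible: " + isOnClasspath(name));
    }
}
```

If this prints false on the Solaris box and true on Linux, the difference is in how the jar is deployed, not in the servlet code.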
Document Get question
Is it possible to get the document name instead of its entire path? Currently, I have something like this:

    out.println (doc.get ("path")); // which gives me /documents/file.txt

Is it possible to get "file.txt"?
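Since the stored "path" value is just a string, one option is to trim it on the way out with java.io.File, which needs nothing from Lucene. A minimal sketch; `FileNameExample` and `baseName` are illustrative names.

```java
import java.io.File;

class FileNameExample {
    // Returns the last component of a path, e.g. the file name.
    static String baseName(String path) {
        return new File(path).getName();
    }

    public static void main(String[] args) {
        System.out.println(baseName("/documents/file.txt"));
    }
}
```

In the servlet this would read `out.println(baseName(doc.get("path")));`. The alternative is to store the bare file name in a field of its own at index time.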
Index Stat Functions
Hi All, I am trying to get some stats on my index, such as:

1) When it was created
2) Size in MB of the index
3) The size and date of each file in the index. For example: if I index 100 files, is it possible for me to get the name, size, and last-modification date of each file (similar to a unix "ls -la /path/to/file")?

tia
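For item 2, the on-disk size can be read from the filesystem without any Lucene call, assuming the index lives in an ordinary directory. The sketch below sums the files in that directory; `IndexStats` and its methods are illustrative names. Note that it lists the index's own files, not the documents that were indexed; per-document name, size and date (item 3) would have to be stored as fields when indexing.

```java
import java.io.File;
import java.util.Date;

class IndexStats {
    // Sums the sizes of the regular files directly inside indexDir.
    // Returns 0 if the directory does not exist or cannot be read.
    static long sizeInBytes(File indexDir) {
        File[] files = indexDir.listFiles();
        if (files == null) return 0L;
        long total = 0L;
        for (File f : files) {
            if (f.isFile()) total += f.length();
        }
        return total;
    }

    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                // name, size in bytes, last modification time
                System.out.println(f.getName() + "\t" + f.length() + "\t"
                        + new Date(f.lastModified()));
            }
        }
        System.out.println("total MB: " + toMegabytes(sizeInBytes(dir)));
    }
}
```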
Any 2.0 examples yet?
Hi All, While searching the net for 2.0 API examples, I noticed there aren't that many. The only example I have seen is the stock example. Are there any tutorials or example codes out there? Tia
Re: Any 2.0 examples yet?
Thanks for the replies. I should have waited a little bit longer for the Genie book (LIA) :-)

On 8/27/06, Michael McCandless <[EMAIL PROTECTED]> wrote:

Erik Hatcher wrote:
> Also let me also emphasize the test cases that are built into the Lucene
> codebase itself. These are premium *always working* examples of how to
> use specific parts of Lucene in an isolated fashion. Check out
> Lucene's trunk (or 2.0 branch) via Subversion and enjoy.

Hear, hear!! I found the examples in LIA (all junit test cases) excellent and in general unit tests are an awesome way to learn the APIs. Also, they only grow/improve (and track API changes) with time so they'll continue to be a great way to learn the APIs.

> p.s. I had updated all of the "Lucene in Action" code to be 2.0
> compliant a while back but have never published it. I'll make a point
> of doing that as soon as possible and posting its location here.

This would be fantastic.

Mike
Sort by Date
Is it possible to sort results by date of the document?
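One common way to make a date sortable in an index is to store it as a fixed-width string whose text order matches time order; as far as I understand, that is the idea behind Lucene's DateTools class. The sketch below shows only the principle with the JDK. `SortableDates` and `toSortable` are illustrative names, not Lucene API, and the minute resolution and GMT time zone are assumptions chosen for the example.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class SortableDates {
    // Formats a timestamp (milliseconds since the epoch) as yyyyMMddHHmm
    // in GMT, so that comparing the strings compares the dates.
    static String toSortable(long millis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmm", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(new Date(millis));
    }

    public static void main(String[] args) {
        // For a file, the timestamp would typically be file.lastModified().
        System.out.println(toSortable(System.currentTimeMillis()));
    }
}
```

The string would be stored in its own untokenized field (for example "modified") and that field name passed to the sort.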
Re: Sort by Date
"Index the date". Do you mean, index date, or the document date? Could this be in a LIA book? On 8/29/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Aug 29, 2006, at 11:50 AM, Mag Gam wrote: > Is it possible to sort results by date of the document? Sure, check out the Sort class and the overloaded IndexSearcher.search () methods that take a Sort. You will need to index the date in a sortable way. DateTools provides handy methods for this purpose. Erik - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Highligher Example
Hey, does anyone have a search result highlighter example? I have various documents (PDF, DOC, TXT, PPT), and I would like to show a highlight, similar to how Google does it... tia
Re: Highligher Example
Thanks for the quick response, Erik. I will be getting my LIA book back very soon; I forgot it at a destination :-( Let's assume there is a document called "hello.pdf" and it has the content "this is hello.pdf. It uses Acrobat". When I perform a search for "Acrobat", I want hello.pdf to show up, and also the 'It uses Acrobat' text, something like that. tia

On 9/7/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:

There are test cases in the Highlighter codebase that exercise it and show its use, as well as a few examples of it in the "Lucene in Action" codebase. These examples output plain text with some prefix and suffix surrounding the highlighted terms. Highlighting text in a PDF is possible, I'm pretty sure, but I don't think the same would be easily possible with Microsoft document formats. I'm not sure if you are asking for these document types to be highlighted or just a plain text representation of them, though.

Erik

On Sep 7, 2006, at 6:37 PM, Mag Gam wrote:
> Hey
>
> Anyone have a search result highlighter example?
>
> I have various doc, PDFs, DOC, TXT, PPT, and I would like to show a
> highlight, similar to how google does it...
>
> tia
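To illustrate the "prefix and suffix surrounding the highlighted terms" output on the hello.pdf example, here is a deliberately naive sketch that works on already-extracted plain text. It is not the Lucene Highlighter and does no analysis or fragment scoring; `NaiveHighlight` and `highlight` are made-up names, and the `<b>` tags are just one possible prefix/suffix.

```java
import java.util.regex.Pattern;

class NaiveHighlight {
    // Wraps every case-insensitive occurrence of term in <b>...</b>.
    // The term is matched literally, not as a regular expression.
    static String highlight(String text, String term) {
        if (term == null || term.isEmpty()) return text;
        Pattern p = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        return p.matcher(text).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("this is hello.pdf. It uses Acrobat", "acrobat"));
    }
}
```

For PDF, DOC or PPT sources the text has to be extracted first; the highlighting itself only ever sees the plain-text representation.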
getCurrentVersion question
Hi All, I am trying to get the exact date when my index was created. I am assuming getCurrentVersion() is the right way of doing it. However, I am getting a result something like this: 1157817833085 According to the API reference, "Reads version number from segments files. The version number is initialized with a timestamp and then increased by one for each change of the index." So, to get the date of this, I should be doing something like this: date=1157817833085-1; Any thoughts? tia
Re: getCurrentVersion question
Tom: great! Now how do you add the metadata? I am new to the Lucene API + Java, but willing to learn. Got an example? TIA

On 9/12/06, Tom Emerson <[EMAIL PROTECTED]> wrote:

As far as I know there isn't a way to do this. What we do is add a "metadata" document to each index that includes the creation date, the user name of the creating user, and various other tidbits. This gets updated on incremental updates to the index as well. Easily done and makes it easy to query.

On 9/9/06, Mag Gam <[EMAIL PROTECTED]> wrote:
>
> Hi All,
>
> I am trying to get the exact date when my index was created. I am assuming
> getCurrentVersion() is the right way of doing it. However, I am getting a
> result something like this: 1157817833085
>
> According to the API reference,
> "Reads version number from segments files. The version number is
> initialized
> with a timestamp and then increased by one for each change of the index."
>
> So, to get the date of this, I should be doing something like this:
> date=1157817833085-1;
>
> Any thoughts?
> tia
>
>

--
Tom Emerson
[EMAIL PROTECTED]
http://www.dreamersrealm.net/~tree
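For reference, the number quoted in the original question reads sensibly as milliseconds since the epoch, which fits the "initialized with a timestamp" wording. The sketch below only converts such a value to a readable UTC date; `VersionAsDate` and `asUtc` are illustrative names. Because the value is then increased by one for each index change, it drifts forward by one millisecond per change, so it is at best an approximation of the creation time and not a substitute for the metadata document described above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class VersionAsDate {
    // Interprets the version value as epoch milliseconds and formats it in UTC.
    static String asUtc(long version) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(version));
    }

    public static void main(String[] args) {
        System.out.println(asUtc(1157817833085L));
    }
}
```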
Example question
While looking at the example's Index and Search code, I noticed that the search contains:

    out.println (doc.get ("path"));

I am not sure how "path" is getting into the index. If you take a look at the Index code, there is no mention of "path". My questions are: what is this path (I know it prints out the filesystem path)? Is this a reserved word? If so, where can I get a list of reserved words? How can I list all hashes like "path"? TIA
Re: Example question
No, I am talking about the Lucene Examples (not from LIA). On 9/16/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: Are you talking about the LIA example code? The sample data gets indexed using the Ant task, which automatically adds a "path" field for every document. You'll see the document handler mentioned in the LIA build.xml file as well as the code for it in the code download. Erik On Sep 16, 2006, at 12:32 PM, Mag Gam wrote: > While looking at the example's Index and Search code, I have > noticed in the > search, there is a : > > out.println (doc.get ("path")); > > I am not sure how is "path" is getting into the index. If you take > a look at > the Index code, there is no mention of "path". My question are: > what is this > path (I know it prints out the filesystem path)? Is this a reserved > word, if > so, where can I get a list of reserved words? How can I list all > hashes like > "path" ? > > TIA - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Example question
Sorry for the confusion, all. The code I am talking about is from the lucene-2.0 API:

    Document doc = hits.doc(i);
    String path = doc.get("path");

lucene-2.0.0/src/demo/org/apache/lucene/demo/SearchFiles.java (line 147)

I am not sure where they are getting the "path". How are they inserting it into the index? Basically, I am trying to index the contents of the files and the filesystem written date. Therefore, I started to create an index, but I am having no luck in either. I have simply been using the example indexing :-( TIA

On 9/16/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:

: No, I am talking about the Lucene Examples (not from LIA).

you're going to need to be more specific about what you mean ... what is the exact location of the file where you are seeing the "Example" in question? ... is it a URL? is it a file from a release you downloaded? what is the URL of the release?

If i had to guess, i'd assume you are talking about either this file...

http://svn.apache.org/viewvc/lucene/java/trunk/src/demo/org/apache/lucene/demo/SearchFiles.java

or one of these files...

http://svn.apache.org/viewvc/lucene/java/trunk/src/jsp/

...but i can't be sure because none of those files have the exact string you mentioned...

: > > out.println (doc.get ("path"));
: > >
: > > I am not sure how is "path" is getting into the index. If you take

... they do reference a field named "path", so it's my best guess, except that...

: > > the Index code, there is no mention of "path". My question are:
: > > what is this
: > > path (I know it prints out the filesystem path)? Is this a reserved

...both HTMLDocument and FileDocument (the classes used to build the indexes for the demo code) do in fact create fields named "path", hence i am stumped as to what exactly you are looking at.

: > > so, where can I get a list of reserved words? How can I list all
: > > hashes like
: > > "path" ?

there are no reserved field names, but you can get a list of all the fields in a document using Document.getFields()

-Hoss
Using example Lucene 2.0 index class
Hi All, I have been using the Lucene 2.0 distro Index to index my files; currently it indexes file path and contents. I also want to index lastModified() (returns the time that the file denoted by this abstract pathname was last modified) and the file length, length(). Can someone please show me how to do that? I am not too strong with Java, so some example code would be nice! TIA
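As a starting point, here is a JDK-only sketch that gathers the values in question from a java.io.File into name/value pairs. `FileFields` and `describe` are made-up names and the map is only a stand-in for a Lucene document; in the indexing code each entry would presumably be added as one more stored field next to the existing path and contents fields.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

class FileFields {
    // Collects the per-file values as strings, keyed by a field name.
    // lastModified() and length() both return 0 for a file that does not exist.
    static Map<String, String> describe(File f) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("path", f.getPath());
        fields.put("modified", Long.toString(f.lastModified())); // epoch milliseconds
        fields.put("length", Long.toString(f.length()));         // size in bytes
        return fields;
    }

    public static void main(String[] args) {
        File f = new File(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, String> e : describe(f).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

If the values are later used for sorting or range queries, a fixed-width representation is usually easier to work with than the raw numbers shown here.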
Derby + Lucene
Anyone here have any luck with integration of Apache Derby and Lucene?
Integrate Lucene with Derby
Are there any documents or plans to integrate Lucene with Apache Derby (database)?
Re: Integrate Lucene with Derby
yes. I have been looking for solutions for a while now. I am not too good with Java but I am learning it... I have asked the kind people of Derby-users, and they say there is no solution for this yet. I guess we can ask the people on the -developer list On 8/13/05, jian chen <[EMAIL PROTECTED]> wrote: > Hi, > > I am also interested in that. I haven't used Derby before, but it > seems the java database of choice as it is open source and a full > relational database. > > I plant to learn the simple usage of Derby and then think about > integrating Derby with Lucene. > > May we should post our progress for the integration and various > schemes of integration in this thread or somewhere else? > > Thanks, > > Jian > > On 8/13/05, Mag Gam <[EMAIL PROTECTED]> wrote: > > Are there any documens or plans to integrate Lucene With Apache Derby > > (database)? > > > > - > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Integrate Lucene with Derby
Thanks for looking into it. Yes, I do agree Derby is very easy to use and get started with. I am looking for a solution more along the lines of tsearch2 (http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A//www.sai.msu.su/%7Emegera/postgres/gist/tsearch/V2/&ei=zcUBQ9aGCYmm-gGPr9DcAg). I guess all we need is a vector data type, for example:

    SELECT 'Our first string used today'::tsvector;
                    tsvector
    ---------------------------------------
     'Our' 'used' 'first' 'today' 'string'
    (1 row)

On 8/14/05, jian chen <[EMAIL PROTECTED]> wrote:
> I just downloaded a copy of the derby binary and successfully run the
> simple example java program. It seems derby is extremely easy to use
> as an embedded java database engine.
>
> This gave me some confidence that I could integrate Lucene with Derby
> and possibly Jetty server, to make a complete java based solution for
> a hobby search project.
>
> I will post more regarding this integration as I go along.
>
> Cheers,
>
> Jian
> www.jhsystems.net
Lucene and Derby
Has anyone been able to integrate Lucene with Apache Derby? I am in need of Full text indexing for my database.
Re: Is Lucene for Me?
How did you integrate Lucene into MS SQL Server? Are there any plans to integrate Lucene into databases like Apache Derby? Has anyone been able to integrate them together? Are there any docs we should look at to get this done?

On 9/14/05, Peter Veentjer - Anchor Men <[EMAIL PROTECTED]> wrote:
>
> We have replaced the MS SQL-server textsearch functionality
> by Lucene, and the responses are a lot quicker now.
>
> (we have 8.000.000 records).
>
> -Original message-
> From: Erik Hatcher [mailto:[EMAIL PROTECTED]
> Sent: Wednesday 14 September 2005 2:33
> To: java-user@lucene.apache.org
> Subject: Re: Is Lucene for Me?
>
>
> On Sep 13, 2005, at 8:27 PM, James Reynolds wrote:
> > Please forgive this low tech question, but I'm wondering if Lucene is
> > an appropriate solution for a challenge I'm facing. I need a quick
> > look up method for a growing list of customers in a database (the
> > alphabetical select list has become too cumbersome).
>
> > Lucene seems to be an excellent option for a key word search, but I
> > wonder if it's overkill for my relatively simple need. Have other
> > users leveraged Lucene in this manner?
>
> Certainly Lucene could do this quite easily, but since you're already
> using a database it would be worth it to explore whether LIKE queries or
> full-text capabilities of your database will achieve what you're after
> without adding another dependency and related code to your project.
>
> It wouldn't be overkill to use Lucene for this scenario at all if you
> can't achieve what you're after within your database as-is.
>
> Erik
Re: Lucene database bindings
Mark: Thanks for looking at this. I will try it out!

On 9/16/05, markharw00d <[EMAIL PROTECTED]> wrote:
>
> I know there have been some posts discussing how to integrate Lucene
> with Derby recently.
>
> I've added an example project that works with both HSQLDB and Derby
> here: http://issues.apache.org/jira/browse/LUCENE-434
>
> The bindings allow you to use SQL that mixes database and Lucene
> functionality in ways like this:
>
>     select top 10 lucene_score(id) as SCORE,
>            lucene_highlight(adText) from ads
>     where pricePounds <200 and pricePounds >1
>       and lucene_query('"drum kit"',id)>0
>     order by SCORE DESC, pricePounds ASC
>
> See the readme.txt in the zip file for details.
>
> Cheers,
> Mark
Re: Lucene database bindings
Does your example store the index in the Derby db or somewhere else? I was thinking of indexing a table in a separate column.

On 9/16/05, markharw00d <[EMAIL PROTECTED]> wrote:
>
> I know there have been some posts discussing how to integrate Lucene
> with Derby recently.
>
> I've added an example project that works with both HSQLDB and Derby
> here: http://issues.apache.org/jira/browse/LUCENE-434
>
> The bindings allow you to use SQL that mixes database and Lucene
> functionality in ways like this:
>
>     select top 10 lucene_score(id) as SCORE,
>            lucene_highlight(adText) from ads
>     where pricePounds <200 and pricePounds >1
>       and lucene_query('"drum kit"',id)>0
>     order by SCORE DESC, pricePounds ASC
>
> See the readme.txt in the zip file for details.
>
> Cheers,
> Mark
Re: Lucene database bindings
Mark: VERY VERY good post! Please publish this doc and example.

On 9/17/05, Chris Lu <[EMAIL PROTECTED]> wrote:
> On 9/17/05, markharw00d <[EMAIL PROTECTED]> wrote:
> > Mag Gam wrote:
> > > Does your example store the index in the derby db or somewhere else?
> > > I was thinking of indexing a table in a separate column.
> >
> > The software is not an org.apache.lucene.store.Directory implementation,
> > i.e. an FSDirectory alternative for persisting Lucene data in a
> > relational table.
> > Instead, the software demonstrates a way to extend SQL syntax to allow
> > Lucene queries to run as in-line functions during the database's
> > execution of queries. These hybrid SQL statements can take advantage of
> > the usual database functions for sorting, grouping, joins, conditions,
> > indexes etc. but also use Lucene queries and highlighting functions, all
> > in the one SQL statement.
> > The Lucene indexes used as part of this can be any standard Directory
> > implementation (e.g. RAM, FS).
> >
> > The motivation for creating a Lucene/RDBMS hybrid query tool was to
> > address issues commonly associated with using just Lucene:
> > 1) Sorting on float/date fields and associated memory consumption
> > 2) Representing numbers/dates in Lucene (e.g. having to pad with
> >    sufficient leading zeros and add to index's list of terms)
> > 3) Retrieving only certain stored fields from a document (all storage
> >    can be done in db)
> > 4) Issues to do with updating *volatile* data, e.g. price data used in
> >    sorts
> > 5) Manually coding joins with RDBMS content as custom filters
> > 6) Too-many terms exceptions produced by range queries
> > 7) Grouping results, e.g. by website
> > 8) Boosting docs based on stored content, e.g. date
> >
> > These are the sorts of things an RDBMS can help with.
> >
> > Cheers
> > Mark
>
> Mark,
>
> This is really good stuff!
> I have been thinking about it for a long while.
> Thank you for showing us the door!
>
> Basically your lucene_query function will return a true/false in one
> of the query predicates for each record.
> This will be very useful when other query predicates can filter out a
> lot of records.
>
> Is there any hint to give DB to use the lucene_query function last?
>
> Chris Lu
>
> Lucene RAD on Any Databases
> http://www.dbsight.net
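Point 2 in Mark's list (padding numbers with leading zeros) comes from index terms being compared as text rather than as numbers. A minimal sketch of the idea in plain Java follows; it does not use Lucene at all, and the class name `PaddedNumbers`, the helper `pad`, and the width of 10 digits are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: shows why numeric values are
// zero-padded before being stored as text terms.
final class PaddedNumbers {

    // Assumed fixed width; 10 digits is enough for any non-negative int.
    static final int WIDTH = 10;

    // Left-pads a non-negative number with zeros so that text order
    // matches numeric order.
    static String pad(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }

    // Sorts a copy of the terms the way plain strings sort (lexicographically).
    static List<String> sortedAsText(List<String> terms) {
        List<String> copy = new ArrayList<>(terms);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Unpadded terms come out in text order, which is not numeric order.
        System.out.println(sortedAsText(Arrays.asList("200", "9", "1000")));
        // Padded terms come out in numeric order.
        System.out.println(sortedAsText(Arrays.asList(pad(200), pad(9), pad(1000))));
    }
}
```

Unpadded, the three prices sort as 1000, 200, 9; padded to a fixed width they sort as 9, 200, 1000. Keeping such values in database columns, as Mark suggests, avoids the encoding step altogether.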
Re: May I use a mixture of indexing methods altogether?
Is it possible to do that in a database instead of a flat text file?

On 9/24/05, Ahmet Aksoy <[EMAIL PROTECTED]> wrote:
> Thank you. That was what I meant!
> I'll try it as soon as possible.
>
> Otis Gospodnetic wrote:
> > If I understand you correctly, then yes, you can index documents with
> > different structures (different field names) in the same index.
> >
> > Otis
> >
> > --- Ahmet Aksoy <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > > I have a project seed in my mind.
> > > I will try to collect everything which has a possibility to be
> > > remembered by me some day, and I will index it with Lucene.
> > > Instead of using simple keywords, I will try to index whole documents
> > > wherever possible. So, I can start searching with a simple word, and
> > > then continue to ask for more detailed answers.
> > > For example, frequently I forget the names of my acquaintances. But I
> > > don't forget their gender. Or, I can remember some other details of
> > > them. So, I should index every possible clue.
> > > But, at the other end, I have some other documents -such as some
> > > passwords, etc- which must be encrypted. Then, I must use some
> > > keywords for them.
> > > Is it possible to use a mixture of indexing methods for the same
> > > cluster of documents? Some of them will be very detailed, and some
> > > of them will contain only a simple keyword.
> > > Is it possible?
> > > Thanks.
> > > Ahmet Aksoy
Can Lucene do this?
I have a document called "foo.txt" and it has a LOT of information in it: various computer tips, programming code, phone numbers, addresses, etc. The document is set up something like this:

foo.txt-
Phone number for Pizza: 1800-999-

To cook rice: Put rice on the cooker, and just cook it

To see what processes are running: ps -ef

and it keeps on going on...
foo.txt-

Can I use Lucene to index this document only, and display blocks of information when I need them? For example, if I type in the query "cook rice", I should get the "To cook rice:" statement and block. TIA!
Re: Can Lucene do this?
Maik: Thanks for the reply. I was going to go that way, but it involves a lot of work, since my text file is about 3 MB of information. However, I am looking into integrating my data with Derby plus Lucene. TIA!

On 9/26/05, Maik Schreiber <[EMAIL PROTECTED]> wrote:
> > Can I use lucene to index this document only, and display blocks of
> > information if I need it? Like if I type in this query, "cook rice" I
> > should get the "To Cook rice:" statement and block...
>
> Just break your original document into multiple documents by splitting on
> empty lines, and add those documents with a "text" field into a Lucene
> index.
>
> --
> Maik Schreiber * http://www.blizzy.de
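The splitting step Maik describes can be sketched in plain Java. This is only a minimal sketch, assuming the blocks in foo.txt really are separated by empty lines; `BlockSplitter` and `splitBlocks` are hypothetical names, and the Lucene part (adding each block as one document with a "text" field) is left out because the exact calls depend on the Lucene version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits a notes file into blocks.
final class BlockSplitter {

    // Splits text into blocks separated by one or more empty
    // (or whitespace-only) lines. Each returned block would become
    // one document in the index.
    static List<String> splitBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        // \R matches any line break; the pattern is a line break followed
        // by at least one blank line.
        for (String block : text.split("\\R(?:[ \\t]*\\R)+")) {
            String trimmed = block.trim();
            if (!trimmed.isEmpty()) {
                blocks.add(trimmed);
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        String notes = "To cook rice:\nPut rice on the cooker\n\n"
                + "To see what processes are running:\nps -ef\n";
        for (String block : splitBlocks(notes)) {
            System.out.println("[" + block + "]");
        }
    }
}
```

Even for a 3 MB file this is a single pass over the text, so the splitting itself should not be the expensive part; each block can then be handed to the indexer as its own document.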
Re: May I use a mixture of indexing methods altogether?
Otis: How do you do that? Got a quick and simple example? We have been looking for an example for the last 3-4 months, but no luck.

On 9/25/05, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> > Is it possible to do that in a database instead of a flat text file?
>
> Huh?
> You mean is it possible to index a database and not text files in this
> fashion? If so: yes.
>
> Otis
Re: May I use a mixture of indexing methods altogether?
Otis: Thanks for the good and clean explanation! I will try this out first and let you know how it goes. What you are saying makes VERY good sense! Once I index them, will the index go to the filesystem, or somewhere else? I want this index to be created in the table, so I can do quick SELECTs through there. TIA!

On 9/26/05, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> It's easy, pull the data from DB using something like JDBC, and from
> retrieved rows create Lucene Documents. Of course, it gets more
> complicated than this, but start with something simple like using JDBC
> to run SELECTs, converting results to Lucene Documents, and index them
> with IndexWriter.
>
> There are also tools like Compass and DBSight that may help.
>
> Otis
>
> --- Mag Gam <[EMAIL PROTECTED]> wrote:
> > Otis:
> >
> > How do you do that? Got a quick and simple example? We have been
> > looking for an example for the last 3-4 months, but no luck
Re: May I use a mixture of indexing methods altogether?
Well, it seems I want to store the index in the database itself. Any ideas for that? Is that even possible?

On 9/26/05, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Lucene indices are created in the file system (FSDirectory) or in
> memory (RAMDirectory). If you want to store them elsewhere, you need
> to implement your own Directory.
>
> Otis
>
> --- Mag Gam <[EMAIL PROTECTED]> wrote:
> > Once I index them, will this goto the filesystem, or somewhere else?
> > I want this index to be created in the table, so I can do quick
> > SELECTs thru there.
Re: Lucene vs SQL database
Check this link out. I am trying to do the same: http://marc.theaimsgroup.com/?l=lucene-user&m=100556272928584&w=2

I am using Apache Derby and trying to integrate that with Lucene. It's tough to find a very, very simple example for this online. Good luck.

On 9/29/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> On Sep 29, 2005, at 8:46 AM, Eugeny N Dzhurinsky wrote:
> > On Thu, Sep 29, 2005 at 08:39:53AM -0400, George Abraham wrote:
> > > Eugene,
> > > You could grab all the fields for a record in a SQL database, mash
> > > it all together and transfer it into one indexing field in Lucene.
> > > Use some scripting tools (or even JDBC and Java) to do this. However
> > > if you are asking if Lucene can go and look over a SQL database and
> > > return results, that would not work. Lucene has to index the database
> > > fields first. The indexing would happen with the first two sentences
> > > of my post.
> >
> > Interesting. We have some kind of set of privileges, required to
> > access the object (let's say rows in table(s)), I thought if it is
> > possible to use kind of "injection" of access control statement to SQL
> > query for extraction of only allowed data... But if Lucene needs to
> > index anything, how could I define the access privileges for data?
>
> There are many options available. One such technique I described in
> "Lucene in Action" ... a SecurityFilter. This simple example scheme
> assumes each document has an "owner" and only owners are allowed to
> see their documents and no others. By applying a SecurityFilter on a
> search, the results are constrained appropriately. This scheme is
> intentionally simplistic to show the possibilities. More commonly
> would be a situation with users and groups that need to be
> dynamically configurable - a Filter could still do this sort of
> thing, but how documents are associated with groups would need to be
> thoroughly conceived.
>
> Erik
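The owner rule Erik describes can be illustrated without any Lucene classes. The sketch below is not the SecurityFilter from "Lucene in Action" and does not use Lucene's Filter API; it only shows the constraint itself, applied to a list of hits after a search. `OwnerFilterSketch`, `Hit`, and `visibleTo` are hypothetical names, and it assumes each indexed document carries an "owner" value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the "only owners see their documents" rule.
final class OwnerFilterSketch {

    // Stand-in for a search hit; a real hit would come from the index
    // and expose the stored "owner" field.
    static final class Hit {
        final String id;
        final String owner;

        Hit(String id, String owner) {
            this.id = id;
            this.owner = owner;
        }
    }

    // Keeps only the hits whose owner is the requesting user.
    static List<Hit> visibleTo(String user, List<Hit> hits) {
        List<Hit> allowed = new ArrayList<>();
        for (Hit hit : hits) {
            if (user.equals(hit.owner)) {
                allowed.add(hit);
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("doc1", "eugeny"));
        hits.add(new Hit("doc2", "george"));
        for (Hit hit : visibleTo("eugeny", hits)) {
            System.out.println(hit.id);
        }
    }
}
```

Filtering after the search is the simplest place to put the rule; a filter applied inside the search, as Erik suggests, gives the same visible set without ever materialising hits the user may not see.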