Re: solr 2.0 branch/sandbox?
On Thu, 2008-10-02 at 23:13 -0400, Ryan McKinley wrote: Hey- Rather than continually point to solr 2.0 as a future future thing, i'd like to give a go at removing all configs and deprecated stuff. -- I doubt that would end up being the real direction, but as an exercise it would be quite valuable to figure out what the major issues will be and see how it feels. What do you think the best way to do this is? Today while preparing a talk with Santiago Gala about the ASF for [1], I stumbled over the following link in one of his slides. http://incubator.apache.org/learn/rules-for-revolutionaries.html Then I remembered this thread. salu2 [1] http://www.opensourceworldconference.com/ How do you feel if I make a branch to experiment with stripping all configs out of solr perhaps: http://svn.apache.org/repos/asf/lucene/solr/branches/sandbox/ or http://svn.apache.org/repos/asf/lucene/solr/branches/sandbox/ryan/ thoughts? ryan -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: Forrest PDF non-Latin-1 support [was: RE: prototype Solr 1.3 RC 1]
On Fri, 2008-08-29 at 17:57 -0400, Steven A Rowe wrote: On 08/29/2008 at 3:24 PM, Chris Hostetter wrote: I suspect the PDF formatter just doesn't play nicely with the non-trivial UTF-8 characters. ... There's an open Forrest bug for this problem: https://issues.apache.org/jira/browse/FOR-132, and the discussion there includes a link to the Cocoon documentation for embedding fonts in PDF files: http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html#FOP+and+Embedding+Fonts. This looks kinda complicated, and AFAICT would require modifications to the Forrest installation wherever the site is built. I just saw the thread, I will have a look. Which version of forrest is currently recommended? I ask because changes have been made (and some are still underway) to the pdf plugin lately. Will let you know about my findings. salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: Hibernate Search and Solr analysis package
On Tue, 2008-03-11 at 03:04 -0400, Emmanuel Bernard wrote: ... Do you guys have any concern with such an approach? I am not only thinking technically (your lights are more than welcome), but more broadly with the concept of code borrowing. You know solr is ASF license, you can borrow as much as you want, just go nuts with it. ;) salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: concurrency while indexing
On Fri, 2008-02-08 at 08:44 -0500, Yonik Seeley wrote: On Feb 8, 2008 3:53 AM, Thorsten Scherler [EMAIL PROTECTED] wrote: I have the following use case: one solr instance which receives add/commit calls constantly from 3 different clients. The machine: Model: HP Proliant DL 360 Memory: 2 GB CPU: 1 Intel Xeon 3.02 GHz Disk: 2 x 36 GB SCSI in RAID I need to raise the number of clients to about 10, can this be a problem for the indexing machine? I'd stop the clients from doing commit themselves unless it's really necessary, and use some form of time based autocommit (see example solrconfig.xml). Cheers Yonik, will have a look. salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
concurrency while indexing
Hi all, I have the following use case: one solr instance which receives add/commit calls constantly from 3 different clients. The machine: Model: HP Proliant DL 360 Memory: 2 GB CPU: 1 Intel Xeon 3.02 GHz Disk: 2 x 36 GB SCSI in RAID I need to raise the number of clients to about 10, can this be a problem for the indexing machine? salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: Solr CWIKI ready for experimenting
On Sun, 2007-12-23 at 00:19 -0800, Chris Hostetter wrote: ... If you are a Solr committer and/or have a CLA on file with the ASF and want to help with Solr documentation, please reply to this thread when you make an account (with the account name please), and i'll add you to the appropriate groups. If it is possible to have an account with write access I would like one (you never know when I will write some lines of documentation). ;) Login: thorsten salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: Commons-csv-0.1-SNAPSHOT
On Mon, 2007-09-24 at 16:23 -0400, Yonik Seeley wrote: On 9/24/07, George Aroush [EMAIL PROTECTED] wrote: For legal paperwork, I have to do a code-scan on all packages that come with Solr 1.2. The one I'm having a hard time with is commons-csv-0.1-SNAPSHOT. As you know, Solr ships commons-csv-0.1-SNAPSHOT.jar -- can someone tell me exactly where the source for this package is? commons has moved to TLP from jakarta, so the source is at http://svn.apache.org/viewvc/commons/sandbox/csv/trunk/ The revision from which this snapshot was built is 524170 Sorry, I probably should have included the revision in the commit message. When I did the patch for SOLR-363 I also wondered which revision it was. IMO the better way is to include the revision in the jar name like: commons-csv-r524170.jar salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: svn commit: r577427 - in /lucene/solr/trunk/client/java/solrj/test/org/apache/solr/client/solrj: LargeVolumeTestBase.java embedded/LargeVolumeEmbeddedTest.java embedded/LargeVolumeJettyTest.java
On Thu, 2007-09-20 at 11:11 -0700, Chris Hostetter wrote: : You can put it in the lib home and ant will find junit. I have it in my : project. : I can submit a patch tomorrow if you want. I'm not sure which lib home you are talking about, but I'm certainly open to a patch that allows us to bundle junit we want so new developers don't have to get it themselves ... i tried doing this not too long ago when i read that taskdefs could specify classpaths for finding the task -- it worked great for some things (like PMD) but i couldn't get it to work with ant to save my life. done https://issues.apache.org/jira/browse/SOLR-362 salu2 : I am importing solr build scripts in my project and build it from my : project without problem because the junit.jar is in my classpath. As : soon as I want to build solr directly I can't because the fail ... junit : check. (Note: even if we can't get a patch working that does this, this specific problem is easy to deal with: even if junit isn't in your ANT_LIB, you can always use ant's -lib option to do this too.) -Hoss -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
[jira] Updated: (SOLR-362) bundle junit with solr
[ https://issues.apache.org/jira/browse/SOLR-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-362: --- Attachment: junit.include.diff junit-4.3.jar Patch to bundle junit with solr. bundle junit with solr -- Key: SOLR-362 URL: https://issues.apache.org/jira/browse/SOLR-362 Project: Solr Issue Type: Improvement Reporter: Thorsten Scherler Attachments: junit-4.3.jar, junit.include.diff http://marc.info/?t=11902336334r=1w=2 ant -version Apache Ant version 1.7.0 compiled on December 13 2006 Trivial modification to the build.xml to include junit with solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: svn commit: r577427 - in /lucene/solr/trunk/client/java/solrj/test/org/apache/solr/client/solrj: LargeVolumeTestBase.java embedded/LargeVolumeEmbeddedTest.java embedded/LargeVolumeJettyTest.java
On Thu, 2007-09-20 at 13:51 -0400, Erik Hatcher wrote: ... I'm happy to see contributions both for Ivy and Maven2 based builds of Solr. I'm sure we can do it in a non-intrusive way to the current Ant build so that folks can try it out. With a patch and a wiki page with instructions that'd be good enough to get the ball rolling. done https://issues.apache.org/jira/browse/SOLR-363 salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
[jira] Updated: (SOLR-363) Use ivy for dependency resolving
[ https://issues.apache.org/jira/browse/SOLR-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-363: --- Attachment: ivy.support.diff ivy.tar Use ivy for dependency resolving Key: SOLR-363 URL: https://issues.apache.org/jira/browse/SOLR-363 Project: Solr Issue Type: Improvement Reporter: Thorsten Scherler Attachments: ivy.support.diff, ivy.tar First cut on ivy support. Removed all libs in lib/ but not yet for the web app. Patch includes basic ivy rep (tar) for all jars that are ATM not in ivy nor in maven. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: svn commit: r577427 - in /lucene/solr/trunk/client/java/solrj/test/org/apache/solr/client/solrj: LargeVolumeTestBase.java embedded/LargeVolumeEmbeddedTest.java embedded/LargeVolumeJettyTest.java
On Wed, 2007-09-19 at 16:33 -0400, Ryan McKinley wrote: Yonik Seeley wrote: ant test from the command line is currently failing. compileTests: [mkdir] Created dir: F:\code\solr\build\tests [javac] Compiling 66 source files to F:\code\solr\build\tests [javac] F:\code\solr\client\java\solrj\test\org\apache\solr\client\solrj\LargeVolumeTestBase.java:27: package org.junit does not exist [javac] import org.junit.Assert; -Yonik hmm, must be a different version of JUnit in my classpath. I just committed something that does not use the JUnit static references, hopefully that will work for you. Perhaps this is an argument for including JUnit or using ivy? http://www.nabble.com/Using-ivy-for-dependency-management--tf4396476.html#a12536854 Like I said, if you want I can provide the ivy config files for solr, but as I understand Erik's answer to this proposal, maven2 is more likely, so I did not ask further. IMHO ivy rocks and would perfectly overcome this problem without cluttering the build files with jar versions. salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: svn commit: r577427 - in /lucene/solr/trunk/client/java/solrj/test/org/apache/solr/client/solrj: LargeVolumeTestBase.java embedded/LargeVolumeEmbeddedTest.java embedded/LargeVolumeJettyTest.java
On Thu, 2007-09-20 at 11:24 -0400, Erik Hatcher wrote: On Sep 19, 2007, at 5:52 PM, Chris Hostetter wrote: Everybody i've ever talked to who i felt confident knew more about ant than me (with Erik at the top of the list) has said the same thing: Put junit and ant-junit in your ANT_LIB ... don't even try to do anything else, it will just burn you. I'm not sure if that is still mandatory. It used to be in the Ant 1.5/.6 days, but I'm a bit out of practice with deep down Ant stuff these days, so consider that advice dated at least and possibly no longer applicable. It is not! You can put it in the lib home and ant will find junit. I have it in my project. I am importing solr build scripts in my project and build it from my project without problem because the junit.jar is in my classpath. As soon as I want to build solr directly I can't because the fail ... junit check. However one can place the junit lib into the lib dir and add it to the classpath. I can submit a patch tomorrow if you want. Erik -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
typo diff
Hi all,

Index: src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java
===================================================================
--- src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java (revision 576331)
+++ src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java (working copy)
@@ -97,7 +97,7 @@
     // TODO -- in the future, we could pick a different parser based on the request
-    // Pick the parer from the request...
+    // Pick the parser from the request...
     ArrayList<ContentStream> streams = new ArrayList<ContentStream>(1);
     SolrParams params = parser.parseParamsAndFillStreams( req, streams );
     SolrQueryRequest sreq = buildRequestFrom( params, streams );

salu2
-- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: [VOTE] release rc3 as Solr 1.2
On Thu, 2007-05-31 at 19:02 -0400, Yonik Seeley wrote: Sorry folks... one more time. This release candidate fixes SOLR-250 (scripts need to tell curl the content-type), as well as the minor README typo. Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/solr/1.2rc3/ as Apache Solr 1.2 +1 Did a quick test. +1 salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
[jira] Commented: (SOLR-238) [Patch] The tutorial on our website is against trunk which causes confusion by user
[ https://issues.apache.org/jira/browse/SOLR-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12498364 ] Thorsten Scherler commented on SOLR-238:

cheers Hoss!
1) yes, you can change the name. I will add a new version.
2) a) no, you can change it in the forrest.properties: #project.schema-dir=${project.resources-dir}/schema is the default. You can change it to something like project.schema-dir=src/schema if you want, just uncomment the property.
   b) not sure about the path, better use the forrest.properties.
3) As I understand it (I used it for the first time in this contribution) it links to the *.ent file, giving the benefit that you can import it into your favorite xml editor: http://forrest.apache.org/docs_0_70/catalog.html Further (as I understand it) forrest is using it to look up the *.ent file.

[Patch] The tutorial on our website is against trunk which causes confusion by user --- Key: SOLR-238 URL: https://issues.apache.org/jira/browse/SOLR-238 Project: Solr Issue Type: Improvement Components: documentation Reporter: Thorsten Scherler Assigned To: Hoss Man Attachments: SOLR-238.diff, SOLR-238.diff, SOLR-238.diff, SOLR-238.png The patch will add a note to the tutorial page with the following headsup: This is documentation for the development version (TRUNK). Some instructions may only work if you are working against svn head. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (SOLR-238) [Patch] The tutorial on our website is against trunk which causes confusion by user
On Tue, 2007-05-15 at 11:04 +1000, David Crossley wrote: Chris Hostetter wrote: ... that's why i was hoping forrest had a variable substitution mechanism built into it that could just read from some file that we have ant generate. ... is there something like that in forrest? There is a facility for XML Entities ... http://forrest.apache.org/faq.html#xml-entities which refers to a demo and explanation in the 'forrest seed' http://forrest.zones.apache.org/ft/build/forrest-seed/samples/xml-entities.html Get your Ant to create the file symbols-project-v10.ent on each nightly run. -David Thanks David. That is actually a really nice way and IMO the cleanest solution. I changed the patch to do exactly what David is recommending. https://issues.apache.org/jira/browse/SOLR-238 salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
[jira] Updated: (SOLR-238) [Patch] The tutorial on our website is against trunk which causes confusion by user
[ https://issues.apache.org/jira/browse/SOLR-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-238: --- Attachment: SOLR-238.diff Patch of the forrest skinconf.xml [Patch] The tutorial on our website is against trunk which causes confusion by user --- Key: SOLR-238 URL: https://issues.apache.org/jira/browse/SOLR-238 Project: Solr Issue Type: Improvement Components: documentation Reporter: Thorsten Scherler Attachments: SOLR-238.diff The patch will add a note to the tutorial page with the following headsup: This is documentation for the development version (TRUNK). Some instructions may only work if you are working against svn head. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-238) [Patch] The tutorial on our website is against trunk which causes confusion by user
[ https://issues.apache.org/jira/browse/SOLR-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-238: --- Attachment: SOLR-238.png screenshot Find window title changed and two new note boxes. [Patch] The tutorial on our website is against trunk which causes confusion by user --- Key: SOLR-238 URL: https://issues.apache.org/jira/browse/SOLR-238 Project: Solr Issue Type: Improvement Components: documentation Reporter: Thorsten Scherler Attachments: SOLR-238.diff, SOLR-238.png The patch will add a note to the tutorial page with the following headsup: This is documentation for the development version (TRUNK). Some instructions may only work if you are working against svn head. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (SOLR-238) [Patch] The tutorial on our website is against trunk which causes confusion by user
On Mon, 2007-05-14 at 11:20 -0700, Hoss Man (JIRA) wrote:

[ https://issues.apache.org/jira/browse/SOLR-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495706 ] Hoss Man commented on SOLR-238:

Thorsten ... thanks for the prod on this issue. One thing that makes this tricky is that the tutorial (and the entire website) are bundled with every release ... that's why we keep the site up to date with the trunk, so that people can review the docs as time goes on, but when a release is cut people using that release should refer to the docs that come with it.

I'm not very knowledgeable in Forrest, do you (or anyone else watching this issue) know if there is an easy way to do variable substitution into the generated docs when they are built using property files (or something like it)? Then the docs could always contain the current Solr spec version number when the tutorial is regenerated (for official releases, the spec version number looks like 1.1, 1.2, etc... for nightly builds it looks like 1.1.2007.05.11.10.10.53 -- the last official version number followed by the current datetime)

Well, the quickest way certainly is changing the skinconf.xml by hand. However, that will not be possible in the use cases you describe (for nightly builds). For this case you would need something more sophisticated.

If I understand it right, you would like to build the site with forrest and have the version number and the name of the dist (ant property ${fullnamever}) appear in the tutorial.

In the solr build.xml we define:

<!-- make a distribution -->
<target name="package" description="Packages the Solr Distribution files and Documentation." depends="dist, example, javadoc">
  <copy todir="${build.docs}">
    <fileset dir="site" />
  </copy>
  ...
</target>

One idea of mine was to use a filter with the copy task so that e.g. @fullnamever@ is substituted with ${fullnamever}. The problem is that it would then not be substituted on the live website.

One could replace step 2 of the "Website update steps" in http://wiki.apache.org/solr/Website_Update_HOWTO with a target that does the filtering for you. Then in a forrest run you would find @fullnamever@, but after building the site and using the copy target with filtering set to true you have the variable substituted. The problem is that the nightly builds would need to build the documentation with forrest as well.

Letting forrest do the substitution and importing forrest targets into the solr build.xml is a similar approach, but then you have an even bigger dependency on forrest.

I need to think about it, but maybe meanwhile somebody on forrest-dev (which I cc) has an idea.

salu2
-- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
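The copy-with-filter idea discussed in this message boils down to replacing @token@ markers in the generated files. A minimal stand-alone Java sketch of that substitution follows. It is not Ant's own implementation (in a build file this would be done by the copy task with filtering enabled); the class name TokenFilterSketch and the sample value are made up for illustration, and leaving unknown tokens untouched is an assumption of this sketch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: replaces @name@ markers with values from a map. */
public class TokenFilterSketch {

    // A token is a run of letters, digits, dots or underscores between two '@'.
    private static final Pattern TOKEN = Pattern.compile("@([A-Za-z0-9_.]+)@");

    public static String filter(String text, Map<String, String> tokens) {
        Matcher m = TOKEN.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = tokens.get(m.group(1));
            // Assumption: tokens without a value stay in the text unchanged.
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // "apache-solr-1.2-dev" is an invented example value for fullnamever.
        Map<String, String> tokens =
                Collections.singletonMap("fullnamever", "apache-solr-1.2-dev");
        System.out.println(filter("Tutorial for @fullnamever@", tokens));
    }
}
```

The sketch also shows why the live website is a problem in this scheme: the substitution only happens when something runs the filter over the built files, so any page published without that step still contains the raw @fullnamever@ marker.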
[jira] Commented: (SOLR-133) change XmlUpdateRequestHandler to use StAX instead of XPP
[ https://issues.apache.org/jira/browse/SOLR-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494005 ] Thorsten Scherler commented on SOLR-133: What is missing with this issue, and where can I give a helping hand? change XmlUpdateRequestHandler to use StAX instead of XPP - Key: SOLR-133 URL: https://issues.apache.org/jira/browse/SOLR-133 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-133-XmlUpdateRequestHandler-StAX-139.patch, SOLR-133-XmlUpdateRequestHandler-StAX-139.patch, SOLR-133-XmlUpdateRequestHandler-StAX-139.patch, SOLR-133-XmlUpdateRequestHandler-StAX-139.patch, SOLR-133-XmlUpdateRequestHandler-StAX-139.patch, SOLR-133.diff, SOLR-133.diff there has been discussion of using StAX for XML parsing of updates instead of XPP ... opening an issue to track it as a possible improvement (originally mentioned in SOLR-61, but that task was more specifically about refactoring the existing code) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-109) variable substitution in lucene query params
[ https://issues.apache.org/jira/browse/SOLR-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-109: --- Attachment: SOLR-109.diff Hi Hoss, I finally got back to my solr project and to this issue. :) Hopefully I can contribute more from now on, since the project got funding for a couple of months. Please find a complete rewrite of the prior incomplete patch. I tried to stay as close to your instructions as I understood them. The only part that I did not understand is the handlers part. I did a grep for SolrQueryParser but could not find an example where I can call SolrQueryParser.setParams(SolrParams params) to prepare the substitution (setSubstitutionSolrParam(String queryText){}). Can somebody push me in the right direction to finish the patch with a nice test? TIA variable substitution in lucene query params Key: SOLR-109 URL: https://issues.apache.org/jira/browse/SOLR-109 Project: Solr Issue Type: New Feature Reporter: Thorsten Scherler Attachments: SOLR-109.diff, SOLR-109.diff Allowing variable substitution in the lucene query params seems pretty slick ... a more general solution might be to modify the SolrQueryParser directly to have a new void setParamVariables(SolrParams p) method. http://marc.theaimsgroup.com/?t=11671237641r=1w=2 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494028 ] Thorsten Scherler commented on SOLR-85: --- Hi Ryan, I just did a quick check of the current trunk and could not find the patch included (as I understood it from your last comment). How can I help to get the patch into the trunk? [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: https://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, SOLR-85-UpdatForms-RequestHandlers.patch, SOLR-85-UpdatForms-RequestHandlers.patch, solr-85-with-104.patch, solr-85-with-104.patch, solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [Solr Wiki] Update of LukeRequestHandler by ryan
On Fri, 2007-04-27 at 05:23 +, Apache Wiki wrote: Dear Wiki user, You have subscribed to a wiki page or wiki category on Solr Wiki for change notification. The following page has been changed by ryan: http://wiki.apache.org/solr/LukeRequestHandler Hi Ryan, I found on the above page: TODO: Anyone who knows XSLT, this would be a great place to contribute! Can you specify where I can help? salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java consulting, training and solutions
Re: [jira] Commented: (SOLR-164) Use the SOLR-86 client in examples of the Solr tutorial
On Tue, 2007-02-20 at 10:11 -0800, Chris Hostetter wrote: : I have reverted the website update, as the SOLR-86 client is not : available in a released version yet. We'll have to update the site once : we do a release. i think it's perfectly fine to have the site reflect what's in the trunk; people who want the tutorial specific to the 1.1 release can find it in the 1.1 release itself. having the site docs reflect the trunk gives us the advantage of having more people review it earlier (before it gets baked into a release) +1 In the current state of solr, version specific documentation on our website does not make much sense. salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
Re: [jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
On Sun, 2007-02-18 at 17:18 -0800, Hoss Man (JIRA) wrote: [ https://issues.apache.org/jira/browse/SOLR-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12474052 ] Hoss Man commented on SOLR-86: -- you guys are too quick for me... i planned on working on this this weekend, and before i get a chance Bertrand overhauls it and improves the error messages, and erik commits it. :) :) Awesome, thank you guys. thanks again to Thorsten for getting the ball rolling on this. You are welcome. I was busy implementing the first version of Apache Droids and did not have a chance to contribute much to Solr this time, but I just finished droids and after committing it to the labs I will attend to my other open issues here on solr. Thanks to everyone who worked on this issue. salu2 [PATCH] standalone updater cli based on httpClient --- Key: SOLR-86 URL: https://issues.apache.org/jira/browse/SOLR-86 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Assigned To: Erik Hatcher Attachments: simple-post-tool-2007-02-15.patch, simple-post-tool-2007-02-16.patch, simple-post-using-urlconnection-approach.patch, solr-86.diff, solr-86.diff We need a cross platform replacement for the post.sh. The attached code is a direct replacement of the post.sh since it is actually doing the same exact thing. In the future one can extend the CLI with other features like auto commit, etc.. Right now the code assumes that SOLR-85 is applied since we are using the servlet of this issue to actually do the update. -- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
Re: [Solr Wiki] Update of SolrTomcat by sunbomb
On Wed, 2007-02-07 at 15:38 +, Apache Wiki wrote: Dear Wiki user, ... - * Copy the solr.war file from c:\temp\solrZip\dist\ to the Tomcat webapps directory c:\tomcat\webapps\ + * Copy the solr.war file from c:\temp\solrZip\dist\ to the Tomcat lib directory c:\tomcat\lib\ Are you sure? That really looks odd. I am not using Windows, but the lib dir instead of the webapps dir? salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
Re: [Solr Wiki] Update of SolrTomcat by sunbomb
On Wed, 2007-02-07 at 13:52 -0800, Chris Hostetter wrote: Consider the change in context: this is in the Multiple Solr apps section where it then uses context fragments to locate the WAR for multiple instances. in the Single Solr app section, c:\tomcat\webapps\ is used. I don't know that putting the war in tomcat\lib really makes sense (i can't imagine it *needs* to be in the lib dir) but i'm sure that in this case putting it in c:\tomcat\webapps\ would be a bad choice -- because tomcat automatically creates a context for anything it finds there right? Only if you have activated this (AFAIR it is set by default to auto deploy). salu2 : Date: Wed, 7 Feb 2007 16:44:40 -0500 : From: Yonik Seeley [EMAIL PROTECTED] : Reply-To: solr-dev@lucene.apache.org : To: solr-dev@lucene.apache.org : Subject: Re: [Solr Wiki] Update of SolrTomcat by sunbomb : : On 2/7/07, Thorsten Scherler [EMAIL PROTECTED] wrote: : On Wed, 2007-02-07 at 15:38 +, Apache Wiki wrote: : Dear Wiki user, : ... : - * Copy the solr.war file from c:\temp\solrZip\dist\ to the Tomcat webapps directory c:\tomcat\webapps\ : + * Copy the solr.war file from c:\temp\solrZip\dist\ to the Tomcat lib directory c:\tomcat\lib\ : : Are you sure? That really looks odd. : : I am not using Windows, but the lib dir instead of the webapps dir? : : I'm using windows, and webapps is always what I've used for Solr. : : -Yonik : -Hoss -- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
Re: Connecting custom RequestHandler
On Sat, 2007-02-03 at 18:41 -0800, Ryan McKinley wrote:

The 'new' request parser reads the content type header to see if it should parse the body as params or not. If the content type is application/x-www-form-urlencoded it parses them as params, if it is multipart/* it parses them as multipart. if it is *anything* else, it reads the body as a stream.

curl defaults the content type to application/x-www-form-urlencoded so you will need to run curl with -H "Content-Type: text/xml"

- - - - - - -

Hmm, I patched the post.sh only regarding the URL.

FILES=$*
URL=http://localhost:8983/solr/update/stax
for f in $FILES; do
  echo Posting file $f to $URL
  curl $URL --data-binary @$f -H 'Content-type:text/xml; charset=utf-8'
  echo
done

Meaning what you describe is already in there. I now played around with it again and, surprise, it works now. Thanks Ryan.

I should add this to the wiki. Can we change the post.sh to use -H "Content-Type: text/xml"? This will not affect the old updater and will work for new UpdateHandlers.

It already does, see above. Not sure why it did not work yesterday. Thanks for the feedback.

salu2

ryan

On 2/3/07, Thorsten Scherler [EMAIL PROTECTED] wrote: On Sat, 2007-02-03 at 18:14 +0100, Thorsten Scherler wrote:

Hi all, I am working on SOLR-133 and I have wrapped up a first version of the XmlUpdateRequestHandlerStax.java. Now I am trying to connect it in the example but I have some problems.

I am trying:
<requestHandler name="/update/stax" class="solr.XmlUpdateRequestHandlerStax" />

Trying to curl to URL=http://localhost:8983/solr/update/stax

Debugging this I used http://localhost:8983/solr/update/xml and I get the same error. So I figured that we still use the @Deprecated SolrUpdateServlet in the post.sh. For now I can test the StAX handler by changing XmlUpdateRequestHandlerStax legacyUpdateHandler; and implementing a doLegacyUpdate method in the handler.

salu2

I get:
Posting file solr.xml to http://localhost:8983/solr/update/stax
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">67</int></lst>
</response>
<html> <head> <title>Error 400 missing content stream stax</title> </head> <body> <h2>HTTP ERROR: 400</h2><pre>missing content stream stax</pre> <p>RequestURI=/solr/update/stax</p> ...

What did I forget, so that the content stream is not passed to the method? Any tip, hint or shot in the dark welcome.

salu2
-- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
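The content-type rule Ryan describes in this thread can be written down as a small decision function. The following is only a sketch of that rule in plain Java, not the code in SolrRequestParsers; the class and enum names are hypothetical, and treating a missing header as a raw stream is an assumption (the thread only says "anything else").

```java
import java.util.Locale;

/** Hypothetical sketch of the content-type dispatch described in the thread. */
public class ContentTypeDispatchSketch {

    /** How the request body would be interpreted. */
    public enum BodyKind { FORM_PARAMS, MULTIPART, RAW_STREAM }

    public static BodyKind classify(String contentType) {
        if (contentType == null) {
            // Assumption: no header at all is handled like any other unknown type.
            return BodyKind.RAW_STREAM;
        }
        // Drop parameters such as "; charset=utf-8" and normalize the case.
        int semi = contentType.indexOf(';');
        String mime = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim().toLowerCase(Locale.ROOT);
        if (mime.equals("application/x-www-form-urlencoded")) {
            return BodyKind.FORM_PARAMS;   // body is parsed as request params
        }
        if (mime.startsWith("multipart/")) {
            return BodyKind.MULTIPART;     // body is parsed as multipart
        }
        return BodyKind.RAW_STREAM;        // anything else: body is a content stream
    }
}
```

Under this rule, curl's default of application/x-www-form-urlencoded lands in the FORM_PARAMS branch, which would explain a "missing content stream" error, while the 'Content-type:text/xml; charset=utf-8' header from the post.sh loop falls through to RAW_STREAM.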
[jira] Updated: (SOLR-133) change XmlUpdateRequestHandler to use StAX instead of XPP
[ https://issues.apache.org/jira/browse/SOLR-133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-133: --- Attachment: SOLR-133.diff Refactoring the XmlUpdateRequestHandler to use constants that can be reused by the StAX implementation. Adding a StAX implementation for the XmlUpdateRequestHandler. So far I get an error about a missing content stream. NOTE: To make this version compile you need to download the JSR 173 API from http://www.ibiblio.org/maven2/stax/stax-api/1.0/stax-api-1.0.jar and copy it to $SOLR_HOME/lib/. change XmlUpdateRequestHandler to use StAX instead of XPP - Key: SOLR-133 URL: https://issues.apache.org/jira/browse/SOLR-133 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-133.diff there has been discussion of using StAX for XML parsing of updates instead of XPP ... opening an issue to track it as a possible improvement (originally mentioned in SOLR-61, but that task was more specifically about refactoring the existing code) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
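For readers who have not used StAX, here is a rough idea of what pull-parsing a Solr add message looks like with the javax.xml.stream API (the JSR 173 API mentioned in the note, bundled with the JDK since Java 6). This is an illustrative sketch only and not the code from SOLR-133.diff; the class and method names are invented, it reads only the first doc element, and it assumes each field holds plain text and appears once.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch: read the fields of the first doc in an add message. */
public class StaxAddSketch {

    public static Map<String, String> readFirstDoc(String xml) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "field".equals(reader.getLocalName())) {
                        String name = reader.getAttributeValue(null, "name");
                        // getElementText() reads up to the matching end tag;
                        // a repeated field name would overwrite the earlier value.
                        fields.put(name, reader.getElementText());
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && "doc".equals(reader.getLocalName())) {
                        break; // stop after the first document
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed update XML", e);
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<add><doc><field name=\"id\">42</field>"
                + "<field name=\"name\">solr</field></doc></add>";
        System.out.println(readFirstDoc(xml));
    }
}
```

The parser is driven by the caller asking for the next event, so a handler can read one document at a time from the request stream without building a tree of the whole body first.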
Re: Connecting custom RequestHandler
On Sat, 2007-02-03 at 18:14 +0100, Thorsten Scherler wrote:

Hi all, I am working on SOLR-133 and I have wrapped up a first version of the XmlUpdateRequestHandlerStax.java. Now I am trying to connect it in the example but I have some problems. I am trying:

<requestHandler name="/update/stax" class="solr.XmlUpdateRequestHandlerStax" />

Trying to curl to URL=http://localhost:8983/solr/update/stax

Debugging this I used http://localhost:8983/solr/update/xml and I get the same error. So I figured that the post.sh still uses the @Deprecated SolrUpdateServlet. For now I can test the StAX version by changing to XmlUpdateRequestHandlerStax legacyUpdateHandler; and implementing a doLegacyUpdate method in the handler. salu2

I get:

Posting file solr.xml to http://localhost:8983/solr/update/stax
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">67</int></lst>
</response>
<html>
<head>
<title>Error 400 missing content stream stax</title>
</head>
<body>
<h2>HTTP ERROR: 400</h2><pre>missing content stream stax</pre>
<p>RequestURI=/solr/update/stax</p>
...

What did I forget, so that the content stream is not passed to the method? Any tip, hint or shot in the dark welcome. salu2

-- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
[jira] Updated: (SOLR-133) change XmlUpdateRequestHandler to use StAX instead of XPP
[ https://issues.apache.org/jira/browse/SOLR-133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-133: --- Attachment: SOLR-133.diff Fixing bugs from first version. Adding workaround for problem with direct use of the handler (never gets a stream). http://www.mail-archive.com/solr-dev@lucene.apache.org/msg02759.html by patching the SolrUpdateServlet Please test, it works fine for me. change XmlUpdateRequestHandler to use StAX instead of XPP - Key: SOLR-133 URL: https://issues.apache.org/jira/browse/SOLR-133 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-133.diff, SOLR-133.diff there has been discussion of using StAX for XML parsing of updates instead of XPP ... opening an issue to track it as a possible improvement (orriginally mentioned in SOLR-61, but that task was more specificly about refactoring the existing code) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-133) change XmlUpdateRequestHandler to use StAX instead of XPP
[ https://issues.apache.org/jira/browse/SOLR-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470001 ] Thorsten Scherler commented on SOLR-133: @Larrea 1) standards-based 2) agree 3) agree 4) agree StAX has become a standard. It is not as fast as SAX, but nearly. IMO the StAX implementation is as easy to follow as the XPP one; personally I think it is even easier. change XmlUpdateRequestHandler to use StAX instead of XPP - Key: SOLR-133 URL: https://issues.apache.org/jira/browse/SOLR-133 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-133.diff, SOLR-133.diff there has been discussion of using StAX for XML parsing of updates instead of XPP ... opening an issue to track it as a possible improvement (originally mentioned in SOLR-61, but that task was more specifically about refactoring the existing code) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-61) move XML update parsing out of SolrCore
[ https://issues.apache.org/jira/browse/SOLR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469450 ] Thorsten Scherler commented on SOLR-61: --- Hi Hoss, I personally would not close this issue, since we have completed one point but not the second one. We have decoupled the XML parsing from SolrCore, but not moved to StAX based parsing. move XML update parsing out of SolrCore --- Key: SOLR-61 URL: https://issues.apache.org/jira/browse/SOLR-61 Project: Solr Issue Type: Improvement Reporter: Yonik Seeley Priority: Minor The XML parsing in SolrCore should be decoupled and moved out. We also might consider moving to StAX based parsing, as it is now a standard and will be included in Java6 (Woodstox could be used for Java5). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Searching with accents
On Thu, 2007-02-01 at 16:35 +0100, Manuel Albela Miranda wrote: Thorsten Scherler wrote: On Thu, 2007-02-01 at 12:37 +0100, Manuel Albela Miranda wrote:

Hello everybody, Do you know if there is a way to search with and without accents without duplicating a field? I have a large index (60Gb) and don't want to have two fields with the same content, one with accents and the other without, because this field is the biggest in the index. Again, hope you can help me.

Try something like this in your schema.xml:

<fieldtype name="stringSimilar" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldtype>

HTH salu2

Thank you very much. Regards. Manu

Hi Thorsten, First of all, thank you for your message. I've been working on the schema.xml file with the lines you sent me. Now I can filter the query, but the problem is that I have accents in my index, so when I search for words with accents, Solr only searches for the word without them, and I need both. I don't know if there is a way to do this.

Well, it is not nice, but you could use fuzzy search, e.g. q=Órden~0.75. That will find more matches. See recent threads around fuzzy search. The schema patch above works nicely once you update your index (index everything again), but that means you would need to reindex the WHOLE 60Gb. salu2

Regards. Manu.

-- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
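Why the same filter chain has to run at both index and query time can be seen with a small stand-in for the accent filter. The sketch below uses java.text.Normalizer to strip combining marks; that is an assumption chosen for illustration and is not the mapping table used by ISOLatin1AccentFilter (which also handles characters that Unicode decomposition does not touch). The class name AccentFolding is hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-in for "lowercase + strip accents"; not the Solr filter itself.
class AccentFolding {
    static String fold(String term) {
        // NFD splits an accented letter into base letter + combining mark(s).
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        // Remove the combining marks, then lowercase.
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String indexedTerm = fold("\u00D3rden"); // what would be stored for the accented form
        String queryTerm = fold("orden");        // what the user typed without the accent
        System.out.println(indexedTerm.equals(queryTerm));
    }
}
```

If only the query side is folded, the stored terms keep their accents and the folded query term no longer matches them, which is the situation Manu describes; once the index is rebuilt with the same folding, both spellings reduce to the same term.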
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469074 ] Thorsten Scherler commented on SOLR-85: --- Hi Ryan, sorry for coming back so late on this, but I need to finish up the first version of a customer project. Anyway, I saw that SOLR-104 is now applied meaning your last patch on this issue should work fine, right. Are they any other blocker on this issue? salu2 [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: https://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-61) move XML update parsing out of SolrCore
[ https://issues.apache.org/jira/browse/SOLR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469076 ] Thorsten Scherler commented on SOLR-61: --- Hi all, I am keen to give this issue a go, somebody can give some hints where to start. TIA salu2 move XML update parsing out of SolrCore --- Key: SOLR-61 URL: https://issues.apache.org/jira/browse/SOLR-61 Project: Solr Issue Type: Improvement Reporter: Yonik Seeley Priority: Minor The XML parsing in SolrCore should be decoupled and moved out. We also might consider moving to StAX based parsing, as it is now a standard and will be included in Java6 (Woodstox could be used for Java5). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-130) [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance
[Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance - Key: SOLR-130 URL: https://issues.apache.org/jira/browse/SOLR-130 Project: Solr Issue Type: Task Reporter: Thorsten Scherler While developing a custom search server based on solr I took some notes about the do's and don'ts. The initial patch is not a fully finished document but may invite other devs to enhance it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
JIRA - adding docu component?
Hi all, I wonder whether we could add a docu component to our jira instance? wdyt? salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
[jira] Updated: (SOLR-109) variable substitution in lucene query params
[ https://issues.apache.org/jira/browse/SOLR-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-109: --- Attachment: SOLR-109.diff This is a first start. What still is missing is ... a more general solution might be to modify the SolrQueryParser directly to have a new void setParamVariables(SolrParams p) method. if it's called (with non null input), then any string that SolrQueryParser instance is asked to parse would first be preprocessed looking for the ${} pattern and pulling the values out of the SOlrParams instance. I need to have a closer look on what Hoss means exactly with this. However I get lots of error after an svn up and I am not sure whether my local changes has caused this. variable substitution in lucene query params Key: SOLR-109 URL: https://issues.apache.org/jira/browse/SOLR-109 Project: Solr Issue Type: New Feature Reporter: Thorsten Scherler Attachments: SOLR-109.diff Allowing variable substitution in the lucene query params seems pretty slick ... a more general solution might be to modify the SolrQueryParser directly to have a new void setParamVariables(SolrParams p) method. http://marc.theaimsgroup.com/?t=11671237641r=1w=2 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Fwd: Reviving Nutch 0.7
On Mon, 2007-01-22 at 10:13 +0100, Zaheed Haque wrote: -- Forwarded message -- From: Zaheed Haque [EMAIL PROTECTED] Date: Jan 22, 2007 10:13 AM Subject: Re: Reviving Nutch 0.7 To: nutch-dev@lucene.apache.org On 1/22/07, Otis Gospodnetic [EMAIL PROTECTED] wrote: Hi, I've been meaning to write this message for a while, and Andrzej's StrategicGoals made me compose it, finally. Nutch 0.8 and beyond is very cool, very powerful, and once Hadoop stabilizes, it will be even more valuable than it is today. However, I think there is still a need for something much simpler, something like what Nutch 0.7 used to be. Fairly regular nutch-user inquiries confirm this. Nutch has too few developers to maintain and further develop both of these concepts, and the main Nutch developers need the more powerful version - 0.8 and beyond. So, what is going to happen to 0.7? Maintenance mode? I feel that there is enough need for 0.7-style Nutch that it might be worth at least considering and discussing the possibility of somehow branching that version into a parallel project that's not just in a maintenance mode, but has its own group of developers (not me, no time :( ) that pushes it forward. Thoughts? I do not really want to comment on the 0.7 part of this discussion. I agree with you that there is a need for 0.7-style Nutch. I wouldn't say reviving but more Disecting and re-directing :-). here you go --- my focus here is 0.7 style i.e. mid-size, enterprise need. Solr could use a good crawler cos it has everything else .. (AFAIK) probably this is not technically plug an pray :-) also I am not sure Solr community wants a crawler but it could benefit from such Solr add on/snap on crawler. I used forrest/cocoon cli as crawler in a forrest plugin I wrote. I will need to look into the nutch crawler code to see whether we could reuse this code. Not sure how close this is married with the db but I guess pretty close. 
Furthermore I am sure some of the 0.7 plugins could be re-factored to fit into Solr. The thing about introducing all this plugin into solr we may come pretty soon into the situation the original thread is describing. We may blow the simple one thing that we want to solve to a well defined problem with too much plugins and components. I like to have solr tools that are doing some well defined processes like updating the solr server with crawled content but like said they are IMO tools not really part of solr core. In the end if you want an enhanced search experience via solr with all the filter goodies then you need to add more fields then the once from the e.g. nutch standard xhtml parser. Certain documents allow fine filtering based on additional information this documents may provide (year, type, organization, author, etc.). It is easy to write a single component to update a certain doc type or set of information against solr, but IMO that should not be the focus of main solr development. I think that should go into a tools/ dir. I will forward the mail to Solr community to see if there any interest. Thanks Zaheed. Fits good into the Update Plugins thread. salu2 Cheers -- thorsten Together we stand, divided we fall! Hey you (Pink Floyd)
Re: Solr graduates and joins Lucene as sub-project
On Wed, 2007-01-17 at 10:07 -0500, Yonik Seeley wrote: Solr has just graduated from the Incubator, and has been accepted as a Lucene sub-project! Thanks to all the Lucene and Solr users, contributors, and developers who helped make this happen! Yeah congrats to the whole community and especially to the incubator mentors and first minute solr project members. Thanks for this awesome project. I have a feeling we're just getting started :-) +1 salu2 -Yonik
Re: Can this be achieved? (Was: document support for file system crawling)
On Tue, 2007-01-16 at 16:28 +0100, Eivind Hasle Amundsen wrote: First: Please pardon the cross-post to solr-user for reference. I hope to continue this thread in solr-dev. Please answer to solr-dev. 1) more documentation (and posisbly some locking configuration options) on how you can use Solr to access an index generated by the nutch crawler (i think Thorsten has allready done this) or by Compass, or any other system that builds a Lucene index. Thorsten Scherler? Hmm, I did the exact opposite. Let me explain you my use case. I am working on a part of a portal http://andaluciajunta.es. The new version of http://andaluciajunta.es/BOJA is this part. The current version is based on a proprietary CMS in a dynamic environment. The new development is using Apache Forrest to generate static html. Now coming to solr/nutch, you can find http://andaluciajunta.es/portal/aj-bojaBuscador/0,22815,,00.html the current search engine especially for the BOJA. This will be changed to a solr powered solution. Like I said I only doing one part of the portal and the main portal has a search engine as well. http://andaluciajunta.es/aj-sea-.html This search engine will be based on nutch in the next version. The special character is that this main portal search engine has to search against the solr BOJA based indexed. Meaning Nutch will have to search the solr index and not vice versa. What I did before we decided to go with solr is a simple test. I copied my solr index into a nutch instance and dispatched a couple of queries. The only thing that you need is to keep your solr schema as close as possible to the one nutch uses. For example nutch is using content, url and title as default fields when returning the search result. If you do not have this fields in your solr schema then nutch will return null. Is this code available anywhere? 
Like stated above it is a couple of lines in the solr schema: field name=title type=string stored=true /field field name=content type=text indexed=true stored=true / field name=url type=string stored=true /field Then you just need to point your nutch instance to this index for searching. The same is true (I guess) for solr searching a nutch index. You could use nutch to update the index, point solr to the index and it should work (if you have defined all field in the schema). Sounds very interesting to me. Maybe someone could ellaborate on the differences between the indexes created by Nutch/Solr/Compass/etc., or point me in the direction of an answer? I am far from being an expert, but actually the only real difference I see is the usage of field names. All indexes could be searched with a raw lucene component (if they are based on the same lucene version) 2) contrib code that runs as it's own process to crawl documents and send them to a Solr server. (mybe it parses them, or maybe it relies on the next item...) Do you know FAST? It uses a step-by-step approach (pipeline) in which all of these tasks are done. Much of it is tuned in a easy web tool. The point I'm trying to make is that contrib code is nice, but a complete package with these possibilities could broaden Solr's appeal somewhat. Hmm, I think like Hoss on this, why do we want do the same work of nutch. If you need a crawler why not use the one from nutch and change some lines? I actually use Forrest as crawler when I generate the new sites, which will then push the content to the solr server via a plugin I developed: http://forrest.apache.org/pluginDocs/plugins_0_80/org.apache.forrest.plugin.output.solr/ 3) Stock update plugins that can each read a raw inputstreams of a some widely used file format (PDF, RDF, HTML, XML of any schema) and have configuration options telling them them what fields in the schema each part of their document type should go in. Exactly, this sounds more like it. 
But if similar inputstreams can be handled by Nutch, what's the point in using Solr at all? The http API's? In other words, both Nutch and Solr seem to have functionality that enterprises would want. But neither gives you the total solution. Not sure. I am using solr because I did not had to develop three different nutch plugin to make it work. Further I have punctual updates where I push a certain set of documents to the server, so no need for a crawler. Don't get it wrong, I don't want to bloat the products, even though it would be nice to have a crossover solution which is easy to set up. The architecture could look something like this: Connector - Parser - DocProc - (via schema) - Index Possible connectors: JDBC, filesystem, crawler, manual feed Possible parsers: PDF, whatever Both connectors, parsers AND the document processors would be plugins. The DocProcs would typically be adjusted for each enterprise' needs, so that it fits with their schema.xml. Problem is; I haven't worked enough with Solr, Nutch, Lucene etc. to really know all
Java version for solr development (was Re: Update Plugins)
On Tue, 2007-01-16 at 15:49 -0500, Yonik Seeley wrote: On 1/16/07, J.J. Larrea [EMAIL PROTECTED] wrote: - Revise the XML-based update code (broken out of SolrCore into a RequestHandler) to use all the above. +++1, that's been needed forever. If one has the time, I'd also advocate moving to StAX (via woodstox for Java5, but it's built into Java6). I was about to have a look at this. Seeing this comment makes me think. I am on 1.5 ATM and using |-- stax-1.2.0-dev.jar `-- stax-utils.jar Two more dependencies. Setting the min version <!-- Java Version we are compatible with --> <property name="java.compat.version" value="1.6" /> would get rid of them. Should I use 1.6 for a patch, or the libs mentioned above? wdyt? salu2 -- thorsten Together we stand, divided we fall! Hey you (Pink Floyd)
[jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[ https://issues.apache.org/jira/browse/SOLR-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465327 ] Thorsten Scherler commented on SOLR-86: --- Yeah, I know what you mean (had a similar problem today). if (!file.isDirectory()){ tool.postFile(file, out); } should fix that. TIA [PATCH] standalone updater cli based on httpClient --- Key: SOLR-86 URL: https://issues.apache.org/jira/browse/SOLR-86 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: simple-post-using-urlconnection-approach.patch, solr-86.diff, solr-86.diff We need a cross platform replacement for the post.sh. The attached code is a direct replacement of the post.sh since it is actually doing the same exact thing. In the future one can extend the CLI with other feature like auto commit, etc.. Right now the code assumes that SOLR-85 is applied since we using the servlet of this issue to actually do the update. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Update Plugins (was Re: Handling disparate data sources in Solr)
On Fri, 2007-01-12 at 15:41 -0500, Yonik Seeley wrote: On 1/10/07, Chris Hostetter [EMAIL PROTECTED] wrote: The one hitch i think to the the notion that updates and queries map cleanlly with something like this... SolrRequestHandler = SolrUpdateHandler SolrQueryRequest = SolrUpdateRequest SolrQueryResponse = SolrUpdateResponse (possibly the same class) QueryResponseWriter = UpdateResponseWriter (possible the same class) ...is that with queries, the input tends to be fairly simple. very generic code can be run by the query Servlet to get all of the input params and build the SolrQueryRequest ... but with updates this isn't quite as simple. there's the two issues i spoke of in my earlier mail which should be independenly confiugable: 1) where does the stream of update data come from? is it in the raw POST body? is it in a POSTed multi-part MIME part? is it a remote resource refrenced by URL? 2) how should the raw binary stream of update data be parsed? is it XML? (in the current update format) is it a CSV file? is it a PDF? ...#2 can be what the SolrUpdateHandler interface is all about -- when hitting the update url you specify a ut (update type) that determines that logic ... but it should be independed of #1 Right, you're getting at issues of why I haven't committed my CSV handler yet. It currently handles reading a local file (this is more like an SQL update handler... only a reference to the data is passed). But I also wanted to be able to handle a POST of the data , or even a file upload from a browser. Then I realized that this should be generic... the same should also apply to XML updates, and potential future update formats like JSON. I do not see the problem here. One just need to add a couple of lines in the upload servlet and change the csv plugin to input stream (not local file). See https://issues.apache.org/jira/secure/attachment/12347425/solar-85.with.file.upload.diff ... 
+boolean isMultipart = ServletFileUpload
+.isMultipartContent(new ServletRequestContext(request));
...
+if (isMultipart) {
+// Create a new file upload handler
...
+commandReader = new BufferedReader(new InputStreamReader(stream));

Now instead of

+core.update(commandReader, responseWriter);

one would use the update handler for the format defined in the request (format=json):

UpdateHandler handler = core.lookupUpdateHandler(format);
handler.update(commandReader, responseWriter);

Or am I missing something? salu2
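The lookup-by-format idea proposed above can be sketched as a small registry. Everything here is hypothetical: UpdateHandler, lookupUpdateHandler and the registry class are illustration names taken from or modelled on the proposal in the mail, not an existing Solr API, and falling back to "xml" when no format is given is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of "pick the update handler by the format request parameter".
class UpdateHandlerRegistry {
    interface UpdateHandler {
        void update(Reader input, Writer response) throws IOException;
    }

    private final Map<String, UpdateHandler> handlers = new HashMap<>();

    void register(String format, UpdateHandler handler) {
        handlers.put(format.toLowerCase(Locale.ROOT), handler);
    }

    UpdateHandler lookupUpdateHandler(String format) {
        // Assumption: requests without a format parameter use the XML handler.
        String key = format == null ? "xml" : format.toLowerCase(Locale.ROOT);
        UpdateHandler handler = handlers.get(key);
        if (handler == null) {
            throw new IllegalArgumentException("no update handler for format: " + format);
        }
        return handler;
    }

    // Convenience for trying the registry out with in-memory strings.
    String handle(String format, String body) {
        StringWriter response = new StringWriter();
        try {
            lookupUpdateHandler(format).update(new StringReader(body), response);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return response.toString();
    }

    private static int countChars(Reader in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    // Toy handlers that only report how much input they consumed.
    static UpdateHandlerRegistry sample() {
        UpdateHandlerRegistry registry = new UpdateHandlerRegistry();
        registry.register("xml", (in, out) -> out.write("xml:" + countChars(in)));
        registry.register("csv", (in, out) -> out.write("csv:" + countChars(in)));
        return registry;
    }
}
```

The point of the design is that the servlet only has to turn the POST body or uploaded file into a Reader; how that stream is parsed is decided by whichever handler is registered for the requested format.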
[jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[ https://issues.apache.org/jira/browse/SOLR-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464737 ] Thorsten Scherler commented on SOLR-86: ---

Hi Hoss, I had a look at your version and it is good as gold. I personally prefer the httpClient since the method is smaller, but Bertrand and yourself are right: the dependency jar price for a simple replacement is ATM too high. The only thing that I would add is directory support:

...
+ if (srcFile.exists()) {
+   if (srcFile.isDirectory()) {
+     File[] fileSet = srcFile.listFiles();
+     for (int i = 0; i < fileSet.length; i++) {
+       File file = fileSet[i];
+       tool.postFile(file, out);
+     }
+   } else {
+     tool.postFile(srcFile, out);
+   }
+   System.out.println();
+ } else {
+   System.err.println(srcFile + " does not exist");
+ }

I agree with using your patch as the official replacement of the post.sh. I further agree with Bertrand that we may include the patch as a base demonstration for more complex client apps.

[PATCH] standalone updater cli based on httpClient --- Key: SOLR-86 URL: https://issues.apache.org/jira/browse/SOLR-86 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: simple-post-using-urlconnection-approach.patch, solr-86.diff, solr-86.diff We need a cross platform replacement for the post.sh. The attached code is a direct replacement of the post.sh since it is actually doing the same exact thing. In the future one can extend the CLI with other feature like auto commit, etc.. Right now the code assumes that SOLR-85 is applied since we using the servlet of this issue to actually do the update. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
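A self-contained variant of the directory handling might look like the sketch below. It separates "which files should be posted" from the posting itself, and it already skips subdirectories, as suggested in the follow-up comment on this issue. The class and method names are hypothetical, the actual posting is left out, and returning an empty list for a missing path (instead of printing an error) is a simplification made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the files a post tool would send for one argument.
class PostTargets {
    static List<File> filesToPost(File src) {
        List<File> targets = new ArrayList<>();
        if (!src.exists()) {
            return targets; // the caller can report "does not exist"
        }
        if (src.isDirectory()) {
            File[] children = src.listFiles();
            // listFiles() returns null if the directory cannot be read.
            if (children != null) {
                for (File child : children) {
                    // Only direct children are posted; nested directories are skipped.
                    if (!child.isDirectory()) {
                        targets.add(child);
                    }
                }
            }
        } else {
            targets.add(src);
        }
        return targets;
    }

    public static void main(String[] args) {
        for (String arg : args) {
            for (File f : filesToPost(new File(arg))) {
                System.out.println("would post " + f);
            }
        }
    }
}
```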
[jira] Created: (SOLR-109) variable substitution in lucene query params
variable substitution in lucene query params Key: SOLR-109 URL: https://issues.apache.org/jira/browse/SOLR-109 Project: Solr Issue Type: New Feature Reporter: Thorsten Scherler Allowing variable substitution in the lucene query params seems pretty slick ... a more general solution might be to modify the SolrQueryParser directly to have a new void setParamVariables(SolrParams p) method. http://marc.theaimsgroup.com/?t=11671237641r=1w=2 -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Java client library (SOLR-20 / SOLR-30)
On Sat, 2007-01-06 at 23:56 -0800, Ryan McKinley wrote: I just posted a solr client that does search and update to SOLR-20. With the addition of: \solr\client\ruby\solrb it seems appropriate to put this (or equivalent) in: \solr\client\java\solrj This version only depends on xpp3-1.1.3.4.O.jar. It is currently using the java.net.HttpURLConnection. It would not be difficult to change this to jakarta HttpClient if that seems like a better idea. I had a look (very nice good work Ryan) and changing java.net.HttpURLConnection to HttpClient is the only thing I personally did. I did something similar: http://svn.apache.org/viewvc/forrest/trunk/whiteboard/plugins/org.apache.forrest.plugin.output.solr/src/java/org/apache/forrest/http/client/PostFile.java?view=markup I just noticed that I did not commit last night. The public PostFile(String destinationUrl, String srcUrl) is now PostFile(String destinationUrl, InputStream stream). salu2 thanks ryan
Re: SQL UpdatePlugin?
On Wed, 2007-01-10 at 22:32 -0800, Ryan McKinley wrote: I'd like to be able to add/update documents from an SQL query. Perhaps: addFromSQL mode=add or replace fields connection=jdbc:mysql://localhost/nblmc?username=xxxpassword=xxx driver=com.mysql.jdbc.Driver multifieldSeperator=\n SELECT * FROM my_stats_table /addFromSQL This would use the the column names as the field name, and the cell value.toString() as the field value. If the schema says the field can have multiple values AND a multifieldSeperator is defined, it will split the value on that string. To get intended results, you may need to use the 'AS' command and perhaps format the cells using SQL. For example: SELECT itemID AS id, name, DATE_FORMAT( addedTime, '%Y-%m-%dT%H:%i:%s.000Z' ) Should this be an implemented as an Update Plugin? or added directly to the DirectUpdateHandler. If it should be an UpdatePlugin, how do i get started? Hmm, I am not an expert but IMO that should not go directly to the DirectUpdateHandler. Solr is following the push model for updates till now. The above is changing this since now solr is pulling the documents to add from the db. Not saying this is bad or good. I think you should consider something like this: DbToSolrXml.java - this component connects to the db and generates proper solr xml update statement. From here you do as usual. Have a look at https://issues.apache.org/jira/browse/SOLR-66 maybe you can use this. HTH salu2 thanks ryan
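The DbToSolrXml idea (keep Solr on the push model and generate the usual XML add message outside of it) could start from something like the sketch below. It only covers turning one result row into an add document; the JDBC part is left out so the example runs without a database. The class name, the Map-based row, and the rule "split every value on the separator when one is given" are assumptions; Ryan's proposal would additionally consult the schema before splitting.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: one database row (column name -> value) becomes one <add><doc>.
class RowToSolrXml {
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String toAddXml(Map<String, String> row, String multiValueSeparator) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (cell.getValue() == null) {
                continue; // SQL NULL: emit no field at all
            }
            // Simplification: split whenever a separator is configured, without checking the schema.
            String[] values = multiValueSeparator == null
                    ? new String[] { cell.getValue() }
                    : cell.getValue().split(Pattern.quote(multiValueSeparator));
            for (String value : values) {
                xml.append("<field name=\"").append(escape(cell.getKey())).append("\">")
                   .append(escape(value))
                   .append("</field>");
            }
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("name", "Tom & Jerry");
        System.out.println(toAddXml(row, null));
    }
}
```

The resulting string can then be posted to the update URL like any other add message, so nothing inside Solr has to know that the data came from a database.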
Re: [VOTE] graduate Solr to Lucene subproject
On Thu, 2007-01-04 at 15:29 -0500, Yonik Seeley wrote: It's time that Solr graduate from the incubator and become an official Lucene subproject. So, please cast your votes: [ ] +1 ask Lucene PMC and the Incubator PMC to graduate Solr from the Incubator to become a Lucene subproject. [ ] 0 Don't care [ ] -1 Not at this time, stay in the Incubator for now. +1 salu2 -- thorsten Together we stand, divided we fall! Hey you (Pink Floyd)
Re: variable substitution in lucene query params (was Re: filter input from multiple fields)
On Wed, 2006-12-27 at 22:53 -0800, Chris Hostetter wrote: : directly to have a new void setParamVariables(SolrParams p) method. if : it's called (with non null input), then any string that SolrQueryParser : instance is asked to parse would first be preprocessed looking for the ${} : pattern and pulling the values out of the SOlrParams instance. : : : When does the setParamVariables(SolrParams p) get called? What should : happen in this method? i was thinking it would be called by the request handerly just after construction -- it would modify the internal state of the QueryParser just like some of the other setters do., for use in the parse method. : not sure whether I understand. : : You mean bingo ... and then just dd the code to handleRequest that uses substitution if non null. Ok, thanks I will have a look and submit a patch. Thanks for your feedback. salu2 thorsten -Hoss
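The preprocessing step discussed here (look for the ${} pattern and pull the values out of the request params) might be sketched as follows. A plain Map stands in for SolrParams, and the class and method names are hypothetical; leaving unknown variables untouched is an assumption, since the thread does not say what should happen in that case.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ${name} substitution; a Map stands in for SolrParams.
class ParamSubstitution {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    static String substitute(String template, Map<String, String> params) {
        Matcher m = VARIABLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            // Assumption: variables without a value are left as they are.
            String replacement = value != null ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("startDate", "2006-01-01T00:00:00Z");
        params.put("endDate", "*");
        System.out.println(substitute("date:[${startDate} TO ${endDate}]", params));
    }
}
```

A setParamVariables(SolrParams p) method as proposed would then only need to store the params, and the parse method would run a substitution like this on the query string before handing it to the normal parser.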
[jira] Commented: (SOLR-30) Java client code for performing searches against a Solr instance
[ http://issues.apache.org/jira/browse/SOLR-30?page=comments#action_12460863 ] Thorsten Scherler commented on SOLR-30: --- Hi all, I had a look at the code and there are a couple of things I do not understand. Since the client can request any response format by defining it in the query string, I am not sure whether protected Response createResponse(final String _xml, final List<String> _fields) throws SAXException, IOException, ParserConfigurationException, JDOMException { makes much sense at all. IMO the java client should make it easy to search a solr server from e.g. a custom servlet. This way we could put all helper classes for connecting to the server into the client. The format that is returned depends on the type defined in the query string, which is why I do not think the JDOM stuff makes sense. Further, the different public Response search methods are IMO not generic enough; why not simply use public Response search(String _query, List<NameValuePair> params) { ... } and return the solr response directly? Then the calling method would need to deal with the raw response. Java client code for performing searches against a Solr instance Key: SOLR-30 URL: http://issues.apache.org/jira/browse/SOLR-30 Project: Solr Issue Type: New Feature Components: search Reporter: Philip Jacob Priority: Minor Attachments: solrsearcher-client.zip Here are a few classes that connect to a Solr instance to perform searches. Results are returned in a Response object. The Response encapsulates a List<Map<String,Field>> that gives you access to the key data in the results. This is the main part that I'm looking for comments on. There are 2 dependencies for this code: JDOM and Commons HttpClient. I'll remove the JDOM dependency in favor of regular DOM at some point, but I think that the HttpClient dependency is worthwhile here. There's a lot that can be exploited with HttpClient that isn't demonstrated in this class. 
The purpose here is mainly to get feedback on the API of SolrSearcher before I start optimizing anything. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
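
To make the "one generic search method" suggestion above concrete, here is a minimal sketch of how such a client could assemble the request and hand back the raw response. It is not taken from the attached solrsearcher-client.zip: the class name SolrQueryUrlSketch, the method names, and the use of a plain Map instead of HttpClient's NameValuePair are assumptions made so the example only needs the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper, not part of the SOLR-30 attachment.
final class SolrQueryUrlSketch {

    // Builds <baseUrl>/select?q=...&name=value... with every part URL-encoded.
    static String buildSelectUrl(String baseUrl, String query, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        url.append("/select?q=").append(encode(query));
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append('&').append(encode(p.getKey()))
               .append('=').append(encode(p.getValue()));
        }
        return url.toString();
    }

    // Returns the raw response stream; the caller picks the parser that
    // matches the response format it asked for in the query string.
    static InputStream openRaw(String url) throws IOException {
        return URI.create(url).toURL().openStream();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

The point of the design is that the client stays format-agnostic: whoever sets the response type in the params is also the one who knows how to read the stream.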
filter input from multiple fields
Hi all,

I am looking for some information about the best way to write a MultipleFieldRequestHandler.java.

My use case is that I have a form (the next version of http://andaluciajunta.es/portal/aj-bojaBuscador/0,22815,,00.html) where you limit the query via a couple of different fields.

Imagine I have a form like

  <form action="select/" method="get">
    <input name="start" value="0" type="hidden"/>
    <input name="rows" value="10" type="hidden"/>
    term: <input type="text" name="q"/> <br/>
    <table>
      <tr> <td colspan="3">between dates:</td> </tr>
      <tr>
        <td> <input type="text" name="startDate"/> </td>
        <td>y</td>
        <td> <input type="text" name="endDate"/> </td>
      </tr>
    </table>
    <input type="submit" value="buscar"/>
  </form>

Using the StandardRequestHandler without prior processing would mean that startDate and endDate are ignored, since they are not part of the query string and are not standard Solr params that get processed. Meaning I can either construct the query string (including this filter) on the client side or with a request handler, right?

JavaScript on the client side is not a possibility for my client, so here I am, about to write my first RequestHandler. I could write a request handler specifically for my client, but I think a more generic solution is more beneficial for all of us. So my idea is to define:

  <requestHandler name="dateRange" class="solr.MultipleFieldRequestHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <str name="filterFields">startDate endDate</str>
      <str name="startDate">*</str>
      <str name="endDate">*</str>
      <str name="fq">date:[${startDate} TO ${endDate}]</str>
    </lst>
  </requestHandler>

Here filterFields defines the array of fields that can be used to limit the superset of docs. If these fields are null for the given request, we use their defined defaults, e.g. <str name="startDate">*</str>, in the fq. Within the fq, each ${...} has to be replaced with the actual value of the variable. Since we defined default values, that should work fine, right?

The only thing is that a simpler, more basic solution could be to just tweak the standard handler to
a) get all params and store them in a hashmap (already done)
b) change SolrPluginUtils.parseFilterQueries to parse the fq string and replace ${startDate} with the value for the corresponding key=startDate, req.getParam(key).

Generally speaking: implement a variable substitution parser (which I would have to do for the MFRH as well).

Which do you think is the better approach: write a RequestHandler, extend the standard one, or is there an even easier way?

TIA for any info.

salu2
Re: filter input from multiple fields
On Fri, 2006-12-22 at 13:29 +0100, Thorsten Scherler wrote:
> Hi all,
>
> I am looking for some information about which is the best way to write a MultipleFieldRequestHandler.java.

I did a small hack and it works like a charm without the above-mentioned handler. I only activated variable substitution for the fq for testing; if you think that is a nice feature, I can activate it for the rest.

Index: src/java/org/apache/solr/util/SolrPluginUtils.java
===================================================================
--- src/java/org/apache/solr/util/SolrPluginUtils.java (revision 489649)
+++ src/java/org/apache/solr/util/SolrPluginUtils.java (working copy)
@@ -59,7 +59,10 @@
  */
 public class SolrPluginUtils {
 
-  /**
+  private static final String VARIABLE_DETERMINATOR_CLOSE = "}";
+  private static final String VARIABLE_DETERMINATOR_OPEN = "${";
+
+/**
    * Set defaults on a SolrQueryRequest.
    *
    * RequestHandlers can use this method to ensure their defaults are
@@ -819,13 +822,24 @@
     SolrQueryParser qp = new SolrQueryParser(s.getSchema(), null);
     for (String q : in) {
       if (null != q && 0 != q.trim().length()) {
-out.add(qp.parse(q));
+out.add(qp.parse(substitute(q, req)));
       }
     }
     return out;
   }
 
-  /**
+  private static String substitute(String q, SolrQueryRequest req) {
+if (q.contains(VARIABLE_DETERMINATOR_OPEN)){
+String beforeVariable =q.substring(0,q.indexOf(VARIABLE_DETERMINATOR_OPEN)) ;
+String variable = q.substring(q.indexOf(VARIABLE_DETERMINATOR_OPEN)+VARIABLE_DETERMINATOR_OPEN.length(),q.indexOf(VARIABLE_DETERMINATOR_CLOSE)) ;
+String afterVariable =q.substring(q.indexOf(VARIABLE_DETERMINATOR_CLOSE)+VARIABLE_DETERMINATOR_CLOSE.length()) ;
+String variableValue= req.getParams().get(variable);
+q = substitute(beforeVariable+variableValue+afterVariable, req);
+}
+return q;
+}
+
+/**

salu2
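
For anyone who wants to play with the ${...} substitution idea outside of Solr, here is a self-contained sketch. It is not the code from the patch: the class name FilterQueryTemplate and the two-map signature (request params plus defaults) are assumptions for illustration. It does a single left-to-right pass instead of recursing, falls back to the configured default when a parameter is missing or blank, and leaves an unterminated "${" untouched.

```java
import java.util.Map;

// Hypothetical stand-alone helper; names are illustrative, not Solr API.
final class FilterQueryTemplate {
    private static final String OPEN = "${";
    private static final String CLOSE = "}";

    static String substitute(String template,
                             Map<String, String> params,
                             Map<String, String> defaults) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int open = template.indexOf(OPEN, pos);
            if (open < 0) {
                break; // no further variables
            }
            int close = template.indexOf(CLOSE, open + OPEN.length());
            if (close < 0) {
                break; // unterminated variable: copy the rest verbatim
            }
            out.append(template, pos, open);
            String name = template.substring(open + OPEN.length(), close);
            String value = params.get(name);
            if (value == null || value.trim().isEmpty()) {
                value = defaults.get(name); // e.g. "*" for an open date range
            }
            if (value == null) {
                throw new IllegalArgumentException("no value or default for '" + name + "'");
            }
            out.append(value); // inserted values are not scanned again
            pos = close + CLOSE.length();
        }
        out.append(template, pos, template.length());
        return out.toString();
    }
}
```

Because inserted values are never re-scanned, a request value that itself contains "${" cannot trigger further expansion. The values still go into the query verbatim, so a real handler would need to validate or escape them before parsing.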
[jira] Created: (SOLR-90) Typo in java docs of QueryParsing.java
Typo in java docs of QueryParsing.java
---
Key: SOLR-90
URL: http://issues.apache.org/jira/browse/SOLR-90
Project: Solr
Issue Type: Improvement
Reporter: Thorsten Scherler

Index: /home/thorsten/src/apache/solr/src/java/org/apache/solr/search/QueryParsing.java
===================================================================
--- /home/thorsten/src/apache/solr/src/java/org/apache/solr/search/QueryParsing.java (revision 489078)
+++ /home/thorsten/src/apache/solr/src/java/org/apache/solr/search/QueryParsing.java (working copy)
@@ -408,7 +408,7 @@
    * The benefit of using this method instead of calling
    * <code>Query.toString</code> directly is that it knows about the data
    * types of each field, so any field which is encoded in a particularly
-   * complex way is still readable. The downside is thta it only knows
+   * complex way is still readable. The downside is that it only knows
    * about built in Query types, and will not be able to format custom
    * Query classes.
    * </p>
Re: [jira] Resolved: (SOLR-90) Typo in java docs of QueryParsing.java
On Thu, 2006-12-21 at 01:15 -0800, Bertrand Delacretaz (JIRA) wrote: [ http://issues.apache.org/jira/browse/SOLR-90?page=all ] Bertrand Delacretaz resolved SOLR-90. - Resolution: Fixed Fixed, thanks! To you. This issue wins the Shortest Lived Patch In Solr History Award ;-) lol :) yes, you are right. Good on ya mate. salu2 Typo in java docs of QueryParsing.java --- Key: SOLR-90 URL: http://issues.apache.org/jira/browse/SOLR-90 Project: Solr Issue Type: Improvement Reporter: Thorsten Scherler Index: /home/thorsten/src/apache/solr/src/java/org/apache/solr/search/QueryParsing.java === --- /home/thorsten/src/apache/solr/src/java/org/apache/solr/search/QueryParsing.java (revision 489078) +++ /home/thorsten/src/apache/solr/src/java/org/apache/solr/search/QueryParsing.java (working copy) @@ -408,7 +408,7 @@ * The benefit of using this method instead of calling * codeQuery.toString/code directly is that it knows about the data * types of each field, so any field which is encoded in a particularly - * complex way is still readable. The downside is thta it only knows + * complex way is still readable. The downside is that it only knows * about built in Query types, and will not be able to format custom * Query classes. * /p
Patches in jira
Hi all,

I noticed many issues in JIRA that are patches but never got committed. There are some problems with keeping patches in the issue tracker for a long time:
- the diff is no longer valid against trunk
- the patch grows multiple versions and packages, which makes it harder to follow
- the person who submitted the patch is not around anymore and cannot give feedback about her submission
- other patches doing similar things duplicate the issue
- merging problems emerge between these duplicate issues
- ...

I noticed that we do not have a sandbox (or incubator) directory for new components that need:
- more documentation
- more testing/feedback
- more community support
- more ...

Maybe it would be a good idea to add new stuff to our own internal incubator and, as soon as we think it is ready, add it to the core. The benefit over JIRA is that other people (besides the original author) can easily submit patches (based on the work of the original patch author) against our incubator svn.

wdyt?

salu2
--
thorsten

"Together we stand, divided we fall!"
Hey you (Pink Floyd)
Re: [jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
On Tue, 2006-12-19 at 21:03 -0800, Otis Gospodnetic (JIRA) wrote:
> [ http://issues.apache.org/jira/browse/SOLR-86?page=comments#action_12459823 ]
>
> Otis Gospodnetic commented on SOLR-86:
> --
> Are you working on unifying SOLR-20, SOLR-30, and this into one coherent java client package? I think that would make quite a few people happy.

Yeah, I noticed. I have not actually tried to work on the unification of both issues, but seeing the community demand for it, maybe I will find some time to have a closer look and unify them somehow. Do not hold your breath, since I first need to investigate another issue that I have with the Solr index and Luke. I will report back ASAP.

salu2

> [PATCH] standalone updater cli based on httpClient
> ---
> Key: SOLR-86
> URL: http://issues.apache.org/jira/browse/SOLR-86
> Project: Solr
> Issue Type: New Feature
> Components: update
> Reporter: Thorsten Scherler
> Attachments: solr-86.diff, solr-86.diff
>
> We need a cross-platform replacement for the post.sh. The attached code is a direct replacement of the post.sh since it does exactly the same thing. In the future one can extend the CLI with other features like auto commit, etc. Right now the code assumes that SOLR-85 is applied, since we are using the servlet of this issue to actually do the update.
Solr index cannot be opened with luke anymore
Hi all,

I am using Luke (http://www.getopt.org/luke/) for inspecting Lucene indexes. When I started to investigate Solr I downloaded a nightly (solr-2006-11-27.zip). I played around and soon after started to use trunk.

I noticed a very weird problem that I do not understand. I tried to open the Solr index with Luke on trunk and it fails with:

java.io.FileNotFoundException: $PATH/solr/data/index/_0.f1 (No such file or directory)

I then tried with my old nightly from 11-27 and it works like a charm.

Since my local trunk has some code from the issue tracker and some other customization, I thought it might be a problem with this customization. To test, I extended the schema from nightly-11-27 with my custom fields, added some documents and opened the index with Luke without any problem. Meaning it is not my fields/schema that makes Luke fail.

Curious, I grabbed today's nightly and ran the example. After adding the solr.xml from the sample docs I tried to connect with Luke again and got the same error. Notice that the exact same file is missing (_0.f1).

Stack:
java.io.FileNotFoundException: /home/thorsten/src/apache/apache-solr-nightly-incubating/example/solr/data/index/_0.f1 (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:393)
    at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:402)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:287)
    at org.apache.lucene.index.SegmentReader.openNorms(SegmentReader.java:500)
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:157)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:129)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:115)
    at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:147)
    at org.apache.lucene.store.Lock$With.run(Lock.java:109)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:140)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:135)
    at org.getopt.luke.Luke.openIndex(Unknown Source)
    at org.getopt.luke.Luke.startLuke(Unknown Source)
    at org.getopt.luke.Luke.main(Unknown Source)
/home/thorsten/src/apache/apache-solr-nightly-incubating/example/solr/data/index/_0.f1 (No such file or directory)

I cannot explain why this suddenly happens (when it works like a charm with last month's code). I even wrote a small Lucene index/segment debugging tool (if somebody is interested I can add it to the issue tracker), but it is not showing any problems, and neither do searching and updating.

Before I start to review all commits since 11-27: does somebody have a guess why this happens and which commit may cause it? Further, should I open an issue in our tracker?

TIA for any info.

salu2
[jira] Created: (SOLR-88) Solr index cannot be opened with luke anymore
Solr index cannot be opened with luke anymore
---
Key: SOLR-88
URL: http://issues.apache.org/jira/browse/SOLR-88
Project: Solr
Issue Type: Bug
Reporter: Thorsten Scherler

http://marc.theaimsgroup.com/?l=solr-dev&m=116661341524556&w=2

...I notice a very weird problem that I do not understand. I tried to open the solr index with luke on trunk and it fails with: java.io.FileNotFoundException: $PATH/solr/data/index/_0.f1 (No such file or directory)...

I am using lukeall.jar version 0.6.
[jira] Commented: (SOLR-88) Solr index cannot be opened with luke anymore
[ http://issues.apache.org/jira/browse/SOLR-88?page=comments#action_12459920 ]

Thorsten Scherler commented on SOLR-88:
---

I think I found the revision:

svn up -r487364 .

Before this revision I can open the index with luke. After this commit I cannot do it anymore. Looking at the changes, it seems that @omitNorms=true is the cause:

svn diff -r487340:487364
Index: example/solr/conf/schema.xml
===================================================================
--- example/solr/conf/schema.xml (revision 487340)
+++ example/solr/conf/schema.xml (revision 487364)
@@ -47,10 +47,10 @@
       limits compression (if enabled in the derived fields) to values which
       exceed a certain size (in characters).
     -->
-    <fieldtype name="string" class="solr.StrField" sortMissingLast="true"/>
+    <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
 
     <!-- boolean type: true or false -->
-    <fieldtype name="boolean" class="solr.BoolField" sortMissingLast="true"/>
+    <fieldtype name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
 
     <!-- The optional sortMissingLast and sortMissingFirst attributes are
          currently supported on types that are sorted internally as strings.
@@ -69,20 +69,20 @@
     <!-- numeric field types that store and index the text
          value verbatim (and hence don't support range queries, since the
          lexicographic ordering isn't equal to the numeric ordering) -->
-    <fieldtype name="integer" class="solr.IntField"/>
-    <fieldtype name="long" class="solr.LongField"/>
-    <fieldtype name="float" class="solr.FloatField"/>
-    <fieldtype name="double" class="solr.DoubleField"/>
+    <fieldtype name="integer" class="solr.IntField" omitNorms="true"/>
+    <fieldtype name="long" class="solr.LongField" omitNorms="true"/>
+    <fieldtype name="float" class="solr.FloatField" omitNorms="true"/>
+    <fieldtype name="double" class="solr.DoubleField" omitNorms="true"/>
 
     <!-- Numeric field types that manipulate the value into
          a string value that isn't human-readable in its internal form,
          but with a lexicographic ordering the same as the numeric ordering,
          so that range queries work correctly. -->
-    <fieldtype name="sint" class="solr.SortableIntField" sortMissingLast="true"/>
-    <fieldtype name="slong" class="solr.SortableLongField" sortMissingLast="true"/>
-    <fieldtype name="sfloat" class="solr.SortableFloatField" sortMissingLast="true"/>
-    <fieldtype name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true"/>
+    <fieldtype name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
+    <fieldtype name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
+    <fieldtype name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
+    <fieldtype name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>
 
     <!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
@@ -105,7 +105,7 @@
          Consult the DateField javadocs for more information.
     -->
-    <fieldtype name="date" class="solr.DateField" sortMissingLast="true"/>
+    <fieldtype name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
 
     <!-- solr.TextField allows the specification of custom text analyzers
          specified as a tokenizer and a list of token filters. Different
@@ -183,24 +183,25 @@
  <fields>
    <!-- Valid attributes for fields:
-     name: mandatory - the name for the field
-     type: mandatory - the name of a previously defined type from the types section
-     indexed: true if this field should be indexed (searchable)
-     stored: true if this field should be retrievable
-     compressed: [false] if this field should be stored using gzip compression
-       (this will only apply if the field type is compressable; among
-the standard field types, only TextField and StrField are)
-     multiValued: true if this field may contain multiple values per document
-     omitNorms: (expert) set to true to omit the norms associated with this field
-       (this disables length normalization and index-time boosting for the field)
-
+     name: mandatory - the name for the field
+     type: mandatory - the name of a previously defined type from the types section
+     indexed: true if this field should be indexed (searchable or sortable)
+     stored: true if this field should be retrievable
+     compressed: [false] if this field should be stored using gzip compression
+       (this will only apply if the field type is compressable; among
+       the standard field types, only TextField and StrField are)
+     multiValued: true if this field may contain multiple values per document
+     omitNorms: (expert) set to true to omit the norms associated with
+       this field (this disables length normalization and index-time
+       boosting for the field, and saves some memory
[jira] Commented: (SOLR-88) Solr index cannot be opened with luke anymore
[ http://issues.apache.org/jira/browse/SOLR-88?page=comments#action_12459926 ]

Thorsten Scherler commented on SOLR-88:
---

Removing omitNorms=true from the schema.xml fixes this issue.

Solr index cannot be opened with luke anymore
---
Key: SOLR-88
URL: http://issues.apache.org/jira/browse/SOLR-88
Project: Solr
Issue Type: Bug
Reporter: Thorsten Scherler
Attachments: commits.since.11-27.log

http://marc.theaimsgroup.com/?l=solr-dev&m=116661341524556&w=2

...I notice a very weird problem that I do not understand. I tried to open the solr index with luke on trunk and it fails with: java.io.FileNotFoundException: $PATH/solr/data/index/_0.f1 (No such file or directory)...

I am using lukeall.jar version 0.6.
[jira] Commented: (SOLR-88) Solr index cannot be opened with luke anymore
[ http://issues.apache.org/jira/browse/SOLR-88?page=comments#action_12459937 ]

Thorsten Scherler commented on SOLR-88:
---

Cheers Erik. That is good to know.

Solr index cannot be opened with luke anymore
---
Key: SOLR-88
URL: http://issues.apache.org/jira/browse/SOLR-88
Project: Solr
Issue Type: Bug
Reporter: Thorsten Scherler
Attachments: commits.since.11-27.log

http://marc.theaimsgroup.com/?l=solr-dev&m=116661341524556&w=2

...I notice a very weird problem that I do not understand. I tried to open the solr index with luke on trunk and it fails with: java.io.FileNotFoundException: $PATH/solr/data/index/_0.f1 (No such file or directory)...

I am using lukeall.jar version 0.6.
Re: preferred xml parser
On Wed, 2006-12-20 at 10:18 -0500, Yonik Seeley wrote:
> On 12/20/06, Thorsten Scherler [EMAIL PROTECTED] wrote:
> > looking at the source code I wonder whether we have a preferred xml parser model? I mean I can find:
> > - pull parsing
> > - DOM
> > - JDOM (at least in some jira patches)
>
> IMO, DOM with xpath is good for config. xpath is easy, flexible and less error prone, and we aren't concerned with performance for reading config.

Agree, for the config.

> Pull parsing (StAX) for anything performance critical. XPP was used at the start, but I think there is a longer term plan to go with StAX.
> http://www.nabble.com/XPP-license-tf1468633.html#a3977357

ok

> > I have not seen SAX yet, and neither StAX. I have had very good experiences with StAX lately; it is fast and easy to use.
> >
> > Do we plan to recommend one technique (at least for the core)? Do we have plans to create an interface for a SolrDocumentFactory? This way we could have various underlying implementations always returning the same thing: xml.
> >
> > I ask because I may look into SOLR-20 and SOLR-30 and would like to use StAX as the underlying parser.
>
> +1 for StAX as the default XML parser.
> For a general Java client though, I'd try and make it flexible enough to get at the underlying data stream so someone could use another parser if they so desire (or different syntaxes such as JSON).

I will think of something. I reckon a well-defined interface will do, and I would provide the default implementation based on StAX.

Cheers for the feedback.

salu2

> -Yonik
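
As a rough illustration of what a StAX-based default implementation could look like, here is a minimal sketch that pulls the docs out of an XML response with the JDK's javax.xml.stream API. The class name StaxDocReader and the flat Map-per-doc result are assumptions, not an agreed interface, and the sketch only handles single-valued fields (it does not descend into nested elements such as multi-valued arrays).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical reader; assumes every child of <doc> is a flat element
// carrying a name attribute and text content only.
final class StaxDocReader {

    static List<Map<String, String>> readDocs(String xml) {
        List<Map<String, String>> docs = new ArrayList<>();
        Map<String, String> current = null; // non-null while inside a <doc>
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if ("doc".equals(r.getLocalName())) {
                        current = new LinkedHashMap<>();
                    } else if (current != null) {
                        String name = r.getAttributeValue(null, "name");
                        // Reads the text and advances to the matching end tag.
                        String text = r.getElementText();
                        if (name != null) {
                            current.put(name, text);
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "doc".equals(r.getLocalName())) {
                    docs.add(current);
                    current = null;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse response XML", e);
        }
        return docs;
    }
}
```

Hiding this behind an interface, as discussed above, would let another implementation read the same stream with a different parser or a different syntax such as JSON.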
[jira] Created: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[PATCH] standalone updater cli based on httpClient
---
Key: SOLR-86
URL: http://issues.apache.org/jira/browse/SOLR-86
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Thorsten Scherler

We need a cross-platform replacement for the post.sh. The attached code is a direct replacement of the post.sh since it does exactly the same thing. In the future one can extend the CLI with other features like auto commit, etc. Right now the code assumes that SOLR-85 is applied, since we are using the servlet of this issue to actually do the update.
[jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[ http://issues.apache.org/jira/browse/SOLR-86?page=comments#action_12459385 ]

Thorsten Scherler commented on SOLR-86:
---

Index: src/tools/updater/lib/commons-httpclient-3.0.1.jar
Index: src/tools/updater/lib/commons-codec-1.3.jar
Index: src/tools/updater/lib/commons-logging-1.1.jar

See http://jakarta.apache.org/commons/httpclient/dependencies.html for the download pages.

After applying the patch and adding the above libs to src/tools/updater/lib you can install the updater with:

ant updater.jar

and follow the instructions given in this target.

[PATCH] standalone updater cli based on httpClient
---
Key: SOLR-86
URL: http://issues.apache.org/jira/browse/SOLR-86
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Thorsten Scherler
Attachments: solr-86.diff

We need a cross-platform replacement for the post.sh. The attached code is a direct replacement of the post.sh since it does exactly the same thing. In the future one can extend the CLI with other features like auto commit, etc. Right now the code assumes that SOLR-85 is applied, since we are using the servlet of this issue to actually do the update.
[jira] Updated: (SOLR-85) [PATCH] Add update form to the admin screen
[ http://issues.apache.org/jira/browse/SOLR-85?page=all ]

Thorsten Scherler updated SOLR-85:
--
Attachment: solar-85.with.file.upload.diff

The solar-85.with.file.upload.diff further extends the current patch and makes the old diffs obsolete. Sadly, I do not have the rights to remove the obsolete attachments. Sorry.

This patch provides a way to upload an update document.

On Fri, 2006-12-15 at 04:57 +0100, Zaheed Haque wrote:
...
> Maybe you can add a file upload button. What I mean is that, let's say, you have a file data.xml or data.txt or data.tar.gzip (maybe gzip or tar format can be done later) with many solr records, like
>   <doc> </doc>
>   etc.. more..
>   <doc> </doc>
> Then you could upload that file and presto, you updated the index. That would be cool. I still would like to have the current textarea box; it's cool to be able to delete a <doc></doc> directly or update a doc directly.

The current implementation assumes the uploaded file is an XML document.

[PATCH] Add update form to the admin screen
---
Key: SOLR-85
URL: http://issues.apache.org/jira/browse/SOLR-85
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Thorsten Scherler
Attachments: solar-85.png, solar-85.with.file.upload.diff, solr-85.diff, solr-85.diff, solr-85.FINAL.diff

It would be nice to have a webform to update solr via a http interface instead of using the post.sh.
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ http://issues.apache.org/jira/browse/SOLR-85?page=comments#action_12458779 ]

Thorsten Scherler commented on SOLR-85:
---

@Hoss yeah, I understand your concerns, and actually the update form is a wee bit apart from the rest of the forms in the admin area. It is more of a general approach rather than one focused on the examples. Maybe this whole patch could be better packaged as a plugin, but I am still new to Solr and I am not sure how I could patch the web.xml with a plugin. Any pointers appreciated.

The next thing on my list is to write a small CLI based on HttpClient to send the update docs as an alternative to the post.sh.

BTW the patch needs
lib/commons-fileupload-1.1.1.jar - http://jakarta.apache.org/site/downloads/downloads_commons-fileupload.cgi
lib/commons-io-1.2.jar - http://jakarta.apache.org/site/downloads/downloads_commons-io.cgi

[PATCH] Add update form to the admin screen
---
Key: SOLR-85
URL: http://issues.apache.org/jira/browse/SOLR-85
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Thorsten Scherler
Attachments: solar-85.png, solar-85.with.file.upload.diff, solr-85.diff, solr-85.diff, solr-85.FINAL.diff

It would be nice to have a webform to update solr via a http interface instead of using the post.sh.
[jira] Updated: (SOLR-85) [PATCH] Add update form to the admin screen
[ http://issues.apache.org/jira/browse/SOLR-85?page=all ] Thorsten Scherler updated SOLR-85: -- Attachment: solar-85.png New screenshot including the new features. [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: http://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh.
[jira] Created: (SOLR-83) DOCU: Quickstart note about how to make the example working in src version
DOCU: Quickstart note about how to make the example working in src version --- Key: SOLR-83 URL: http://issues.apache.org/jira/browse/SOLR-83 Project: Solr Issue Type: Improvement Reporter: Thorsten Scherler Priority: Minor
[jira] Updated: (SOLR-83) DOCU: Quickstart note about how to make the example working in src version
[ http://issues.apache.org/jira/browse/SOLR-83?page=all ] Thorsten Scherler updated SOLR-83: -- Attachment: SOLR-83.diff patch for $SOLR_HOME/README.txt DOCU: Quickstart note about how to make the example working in src version -- Key: SOLR-83 URL: http://issues.apache.org/jira/browse/SOLR-83 Project: Solr Issue Type: Improvement Reporter: Thorsten Scherler Priority: Minor Attachments: SOLR-83.diff
[jira] Created: (SOLR-85) [PATCH] Add update form to the admin screen
[PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: http://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler It would be nice to have a webform to update solr via a http interface instead of using the post.sh.
[jira] Updated: (SOLR-85) [PATCH] Add update form to the admin screen
[ http://issues.apache.org/jira/browse/SOLR-85?page=all ] Thorsten Scherler updated SOLR-85: -- Attachment: solar-85.png Screenshot of the extended admin screen [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: http://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png It would be nice to have a webform to update solr via a http interface instead of using the post.sh.
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ http://issues.apache.org/jira/browse/SOLR-85?page=comments#action_12458470 ] Thorsten Scherler commented on SOLR-85: --- This new feature allows you to update your solr instance via the web. For your convenience I added a commit button so you can commit directly afterwards. Just add your update statement into the form and submit. Try it with:
<add>
<doc>
<field name="id">SP2514N</field>
<field name="name">Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133</field>
<field name="manu">Samsung Electronics Co. Ltd.</field>
<field name="cat">electronics</field>
<field name="cat">hard drive</field>
<field name="features">7200RPM, 8MB cache, IDE Ultra ATA-133</field>
<field name="features">NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor</field>
<field name="price">92</field>
<field name="popularity">6</field>
<field name="inStock">true</field>
</doc>
</add>
[PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: http://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh.
Update via web (was Re: [jira] Updated: (SOLR-85) [PATCH] Add update form to the admin screen)
Hi all, can somebody remove the former attachments from 14/Dec/06 04:34 AM and 14/Dec/06 04:36 AM from http://issues.apache.org/jira/browse/SOLR-85? The only valid patch is solr-85.FINAL.diff. TIA

This new feature allows you to update your solr instance via the web. For your convenience I added a commit button so you can commit directly afterwards. Just add your update statement into the form and submit. Try it with:
<add>
<doc>
<field name="id">SP2514N</field>
<field name="name">Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133</field>
<field name="manu">Samsung Electronics Co. Ltd.</field>
<field name="cat">electronics</field>
<field name="cat">hard drive</field>
<field name="features">7200RPM, 8MB cache, IDE Ultra ATA-133</field>
<field name="features">NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor</field>
<field name="price">92</field>
<field name="popularity">6</field>
<field name="inStock">true</field>
</doc>
</add>
Hope this might be useful. salu2 -- thorsten Together we stand, divided we fall! Hey you (Pink Floyd)
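For anyone who would rather script the same add-then-commit steps than paste into the form, here is a minimal Python sketch. It is not part of the SOLR-85 patch. The URL http://localhost:8983/solr/update is an assumption (the usual location for the bundled example server), and the helper names build_add_xml and build_update_request are made up for illustration. The sketch only builds the requests; sending them is left as a comment so it can be run without a Solr instance.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: update handler of a locally running example server.
# Adjust host, port and path for your own installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(fields):
    """Build an <add><doc>...</doc></add> message from (name, value) pairs.

    Repeated names (e.g. two "cat" entries) become repeated <field> elements.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


def build_update_request(xml_body, url=SOLR_UPDATE_URL):
    """Wrap an update message in a POST request object (nothing is sent here)."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


fields = [
    ("id", "SP2514N"),
    ("cat", "electronics"),
    ("cat", "hard drive"),
    ("price", 92),
    ("inStock", "true"),
]
add_request = build_update_request(build_add_xml(fields))
commit_request = build_update_request("<commit/>")

# To actually send them against a running server:
#   urllib.request.urlopen(add_request)
#   urllib.request.urlopen(commit_request)
```

Building the message with ElementTree rather than string concatenation means characters such as & in a product name are escaped automatically.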