Streaming Expressions and R
As Solr's Streaming Expression support gains more statistical and data-analysis functionality, I thought it would be useful to have first-class support in R. To this end, I've begun an R package for this here: https://github.com/jdyer1/R-solr-stream At this point, this new package allows users to execute a streaming expression and read the result into an R data.frame. Likewise, an R object can be streamed to Solr. This has obvious overlap with the existing "rsolr" package. However, the existing package does not, best I can tell, support streaming expressions. Also, we already have support via the JDBC driver. My question is whether or not an effort along these lines is worthwhile, and if so, what future direction it should take. I appreciate any feedback. James Dyer Ingram Content Group
RE: [JENKINS-EA] Lucene-Solr-6.x-Linux (32bit/jdk-9-ea+168) - Build # 3502 - Still Failing!
I committed the updated hsqldb jar today, along with its sha1. Pre-commit passes for me. Does anyone know why Jenkins is complaining and if there is anything I must do to fix this? James Dyer Ingram Content Group -Original Message- From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de] Sent: Friday, May 12, 2017 1:33 PM To: dev@lucene.apache.org Subject: [JENKINS-EA] Lucene-Solr-6.x-Linux (32bit/jdk-9-ea+168) - Build # 3502 - Still Failing! Importance: Low Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/3502/ Java: 32bit/jdk-9-ea+168 -client -XX:+UseG1GC All tests passed Build Log: [...truncated 49636 lines...] BUILD FAILED /home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:775: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:655: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:643: Source checkout is modified!!! Offending files: * solr/licenses/hsqldb-2.4.0.jar.sha1 Total time: 63 minutes 5 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts [WARNINGS] Skipping publisher since build result is FAILURE Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: JDBCStream and loading drivers
Thank you for the quick replies. I can see how it would be powerful to be able to execute streaming expressions outside of Solr, giving yourself the option of moving some of the work to the client. I wouldn't necessarily tie it into core, because being able to join a Solr stream with an RDBMS result -- either within Solr, or in your driver program -- could be a nice set of options to have. But the patch on SOLR-1015 seems (from a quick look) to get this right, in that it uses the core's classloader when it is available, and falls back when it is not. It might be nice -- especially as the streaming code base grows -- to consider packaging it separately from the SolrJ client itself. Along these lines: I was initially confused by the examples in https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions in that the cURL example at the top is materially different from the SolrJ example following it. That is, with the cURL example, all of the work occurs in Solr and only the final result is streamed back. With the SolrJ example, some of that work is done in the client. This is easy to discover if you try the JDBC expression: following the cURL example, the query originates in Solr; with the SolrJ example, the query originates on the client -- the server has no involvement at all. Is my understanding here correct? I can see how this design has a great advantage, as it gives us the ability to write driver programs that use the Solr cores as worker nodes. But this wasn't immediately clear to me. I also wonder: do we have an (easy) way with SolrJ currently to simply execute a (chain of) streaming expression(s) and get the result back, like in the cURL example (besides using JDBC)? 
James Dyer Ingram Content Group From: Joel Bernstein [mailto:joels...@gmail.com] Sent: Tuesday, April 25, 2017 6:25 PM To: lucene dev <dev@lucene.apache.org> Subject: Re: JDBCStream and loading drivers There are a few stream impl's that have access to SolrCore (ClassifyStream, AnalyzeEvaluator) because they use analyzers. These classes have been added to core. We could move the JdbcStream to core as well if it makes the user experience nicer. Originally the idea was that you could run the Streaming API Java classes like you would other Solrj clients. I think over time this may become important again, as I believe there is work underway for spinning up worker nodes that are not attached to a SolrCore. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Apr 25, 2017 at 3:25 PM, Dyer, James <james.d...@ingramcontent.com> wrote: Using JDBCStream, Solr cannot find my database driver if I put the .jar in the shared lib directory ($SOLR_HOME/lib). In order for the classloader to find it, the driver has to be in the server's lib directory. Looking at why, I see that to get the full classpath, including what is in the shared lib directory, we'd typically get a reference to a SolrCore, call "getResourceLoader" and then "findClass". This makes use of the URLClassLoader that knows about the shared lib. But fixing JDBCStream to do this might not be so easy? Best I can tell, Streaming Expressions are written nearly stand-alone as client code that merely executes in the Solr JVM. Is this correct? Indeed, the code itself is included with the client, in the SolrJ package, despite it mostly being server-side code … Maybe I misunderstand? On the one hand, it isn't a huge deal as to where you need to put your drivers to make this work. But on the other hand, it isn't really the best user experience, in my opinion at least, to have to dig around the server directories to find where your driver needs to go. 
And also, if this is truly server-side code, why do we ship it with the client jar? Unless there is a desire to make a stand-alone Streaming Expression engine that interacts with Solr as a client, would it be acceptable to somehow expose the SolrCore to it for loading resources like this? James Dyer Ingram Content Group
JDBCStream and loading drivers
Using JDBCStream, Solr cannot find my database driver if I put the .jar in the shared lib directory ($SOLR_HOME/lib). In order for the classloader to find it, the driver has to be in the server's lib directory. Looking at why, I see that to get the full classpath, including what is in the shared lib directory, we'd typically get a reference to a SolrCore, call "getResourceLoader" and then "findClass". This makes use of the URLClassLoader that knows about the shared lib. But fixing JDBCStream to do this might not be so easy? Best I can tell, Streaming Expressions are written nearly stand-alone as client code that merely executes in the Solr JVM. Is this correct? Indeed, the code itself is included with the client, in the SolrJ package, despite it mostly being server-side code ... Maybe I misunderstand? On the one hand, it isn't a huge deal as to where you need to put your drivers to make this work. But on the other hand, it isn't really the best user experience, in my opinion at least, to have to dig around the server directories to find where your driver needs to go. And also, if this is truly server-side code, why do we ship it with the client jar? Unless there is a desire to make a stand-alone Streaming Expression engine that interacts with Solr as a client, would it be acceptable to somehow expose the SolrCore to it for loading resources like this? James Dyer Ingram Content Group
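The "use the core's resource loader when available, fall back otherwise" approach described above can be sketched generically. This is an illustrative sketch only, not Solr's actual code; the DriverLoader class and the libDir parameter are hypothetical names:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: prefer a classloader built over a shared lib
// directory of jars; fall back to the default loader when none exists.
public class DriverLoader {
    static Class<?> loadDriver(String className, Path libDir) throws Exception {
        if (libDir != null && Files.isDirectory(libDir)) {
            List<URL> urls = new ArrayList<>();
            try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
                for (Path jar : jars) {
                    urls.add(jar.toUri().toURL());
                }
            }
            ClassLoader cl = new URLClassLoader(urls.toArray(new URL[0]),
                    DriverLoader.class.getClassLoader());
            return Class.forName(className, true, cl);
        }
        // Fallback: no shared lib directory available, use the default loader.
        return Class.forName(className);
    }

    public static void main(String[] args) throws Exception {
        // With no lib dir configured, this resolves via the default loader.
        System.out.println(loadDriver("java.util.ArrayList", null).getName());
    }
}
```

A JDBC driver jar dropped into the lib directory would then be visible without touching the server's own classpath, which is the user-experience point made above.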
RE: Speculating about the removal of the standalone Solr mode
I would think it unfortunate if this ever happens. Solr in non-cloud mode is simple, easy to understand, and has few moving parts. Many installations do not need to shard, have real-time updates, etc. Using the replication handler in "legacy mode" works great for us. The config files are on the filesystem. You need not learn a CLI to interact with ZooKeeper, etc. I would be scared to death running cloud mode in production if I didn't first obtain an in-depth understanding of ZooKeeper internals. I can see it if there is a huge burden imposed here and if almost all use-cases require cloud. But as for "API consolidation", there are few APIs you need to learn if running non-cloud. So what stops us from focusing APIs on the needs of cloud installations? And the documentation for non-cloud ought to be simple to maintain; there's so much less to learn and know. For those of you who work as consultants or for support providers, it may seem that everyone is running cloud mode. But my guess is those who run cloud mode are the ones who cannot get by without your services. James Dyer Ingram Content Group -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Wednesday, March 09, 2016 11:34 AM To: dev@lucene.apache.org Subject: Speculating about the removal of the standalone Solr mode I've been thinking about the fact that standalone and cloud modes in Solr are very different. The writing on the wall suggests that Solr will eventually (probably 7.0 minimum) eliminate the standalone mode and always operate with zookeeper. A "standalone" node would in fact be a single-node cloud running the embedded zookeeper. Once zk-as-truth becomes a reality, I can see a few advantages to always running in cloud mode. The documentation can include one way to accomplish basic tasks. The CoreAdmin API can be eliminated, and any required functionality fully merged into the Collections API. CloudSolrClient will work for all installations. 
A script that works for cloud mode will also work for standalone mode, because that's just a smaller cloud. I was planning to open an issue to discuss and implement this. If that's not a good idea, please let me know. None of my main Solr installations are running in cloud mode, so the removal of standalone mode will be an inconvenience for me, but I still think it's the right thing to do in the long term. Thanks, Shawn
RE: Lucene/Solr git mirror will soon turn off
I know Infra has tried a number of things to resolve this, to no avail. But did we try "git-svn --revision=" to only mirror "post-LUCENE-3930" (ivy, r1307099)? Or if that's not lean enough for the git-svn mirror to work, then cut off when 4.x was branched or whenever. The hope would be to give git users enough of the past that it would be useful for new development but then also we can retain the status quo with svn (which is the best path for a 26-day timeframe). James Dyer Ingram Content Group -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Friday, December 04, 2015 2:58 PM To: Lucene/Solr dev Cc: infrastruct...@apache.org Subject: Lucene/Solr git mirror will soon turn off Hello devs, The infra team has notified us (Lucene/Solr) that in 26 days our git-svn mirror will be turned off, because running it consumes too many system resources, affecting other projects, apparently because of a memory leak in git-svn. Does anyone know of a link to this git-svn issue? Is it a known issue? If there's something simple we can do (remove old jars from our svn history, remove old branches), maybe we can sidestep the issue and infra will allow it to keep running? Or maybe someone in the Lucene/Solr dev community with prior experience with git-svn could volunteer to play with it to see if there's a viable solution, maybe with command-line options e.g. to only mirror specific branches (trunk, 5.x)? Or maybe it's time for us to switch to git, but there are problems there too, e.g. we are currently missing large parts of our svn history from the mirror now and it's not clear whether that would be fixed if we switched: https://issues.apache.org/jira/browse/INFRA-10828 Also, because we used to add JAR files to svn, the "git clone" would likely take several GBs unless we remove those JARs from our history. 
Or if anyone has any other ideas, we should explore them, because otherwise in 26 days there will be no more updates to the git mirror of Lucene and Solr sources... Thanks, Mike McCandless http://blog.mikemccandless.com
RE: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_66) - Build # 5439 - Failure!
I'm looking at this failure. I cannot reproduce this on Linux using: ant test -Dtests.class="*.SpellCheckComponentTest" -Dtests.seed=110D525A21D16B1:8944EAFF0CE17B49 Tomorrow I will try this on Windows. James Dyer Ingram Content Group -Original Message- From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de] Sent: Wednesday, December 02, 2015 12:53 PM To: ans...@apache.org; mikemcc...@apache.org; sha...@apache.org; romseyg...@apache.org; dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_66) - Build # 5439 - Failure! Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/5439/ Java: 64bit/jdk1.8.0_66 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.handler.component.SpellCheckComponentTest.test Error Message: List size mismatch @ spellcheck/suggestions Stack Trace: java.lang.RuntimeException: List size mismatch @ spellcheck/suggestions at __randomizedtesting.SeedInfo.seed([110D525A21D16B1:8944EAFF0CE17B49]:0) at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:837) at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:784) at org.apache.solr.handler.component.SpellCheckComponentTest.test(SpellCheckComponentTest.java:96) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
RE: Solr Spell checker for non-english language
Safat, DirectSolrSpellChecker defaults to Levenshtein Distance to determine how closely the query terms match the actual terms in the index (see https://en.wikipedia.org/wiki/Levenshtein_distance). This is not an English-specific metric and it works for many languages. Assuming this is not appropriate for the Bangla language (sorry for my ignorance!), you might need to implement your own distance metric by implementing the StringDistance interface. You can specify your custom class using the distanceMeasure parameter under the SpellCheckComponent entry in solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">fully.qualified.classname.here</str>
    .. etc ..
  </lst>
</searchComponent>

For more information, see: http://lucene.apache.org/core/5_2_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html#setDistance%28org.apache.lucene.search.spell.StringDistance%29 Finally, if misplaced whitespace in the query is a problem in Bangla, you may wish to consider using WordBreakSolrSpellChecker in conjunction with DirectSolrSpellChecker to correct these problems as well. See the main Solr example solrconfig.xml for more information. (https://github.com/apache/lucene-solr/blob/branch_5x/solr/example/files/conf/solrconfig.xml) James Dyer Ingram Content Group From: Safat Siddiqui [mailto:safat...@gmail.com] Sent: Monday, July 06, 2015 10:06 PM To: dev@lucene.apache.org Subject: Solr Spell checker for non-english language Hello, I am using Solr version 4.10.3 and trying to customize it for the Bangla language. I have already built a Bangla language stemmer for Solr indexing: it works fine. Now I would like to use the Solr spell checker and suggestion functionality for Bangla. Which section in DirectSolrSpellChecker should I modify? I cannot find which section causes the difference between English and non-English languages. 
A direction will be very helpful for me. Thanks in advance. Regards, Safat -- Thanks, Safat Siddiqui Student Department of CSE Shahjalal University of Science and Technology Sylhet, Bangladesh.
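For reference, the Levenshtein distance mentioned in the reply above can be computed with the classic dynamic-programming algorithm. This is a minimal sketch of the metric itself, not Lucene's implementation (DirectSpellChecker uses its own optimized variant); the class and method names are hypothetical:

```java
// Classic two-row dynamic-programming Levenshtein edit distance.
// Operates per char, so it is language-agnostic for BMP characters,
// which is the point made above about the metric not being English-specific.
public class Levenshtein {
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1,  // insertion
                                            prev[j] + 1),      // deletion
                                   prev[j - 1] + cost);        // substitution
            }
            int[] t = prev; prev = curr; curr = t;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3
    }
}
```

A custom StringDistance for a particular script would typically wrap logic like this while adding script-specific substitution costs.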
RE: [VOTE] Move trunk to Java 8
+1 to stay on 1.7, from another small-scale committer. I do not see any compelling reason to upgrade to 1.8, except, as Benson says, for minor programming conveniences. My company would be one of the ones you'd be shutting out if we were on 1.8 now. (Some of our apps upgraded to 1.7 this year.) Of course there is 4.x, but what is 1.8 going to buy 5.x that makes it worth significantly shrinking the potential user base? James Dyer Ingram Content Group (615) 213-4311 From: Benson Margulies [mailto:bimargul...@gmail.com] Sent: Friday, September 12, 2014 3:45 PM To: dev@lucene.apache.org Subject: Re: [VOTE] Move trunk to Java 8 Corporate overlords isn't helpful. Lucene is what it is because of its wide adoption. That includes big, small, smart, and stupid organizations. I don't think that an infrastructure component like Lucene needs to be 'ahead of the curve'. It should aim to be widely adoptable. To me, that means moving to a new Java requirement after we observe it is semi-ubiquitous. If 1.8 offered some game-changing JVM feature that would allow a giant leap forward in Lucene, then that would be different. So far, all I see are some minor programming conveniences. However, I'm just one very small scale committer, and I've consumed enough oxygen on this topic.
RE: Adding Morphline support to DIH - worth the effort?
Alexandre, I think that writing a new entity processor for DIH is a much less risky thing to commit than, say, SOLR-4799. Entity Processors work as plug-ins and they aren't likely to break anything else. So a Morphline EntityProcessor is much more likely to be evaluated and committed. But like anything else, you're going to need to explain what the need is and what this new e.p. buys the user community. There needs to be unit tests, etc. Besides this, if you can show how a morphline e.p. can be a step towards migrating away from DIH entirely, then that would be a plus. Perhaps create a new solr example along the lines of the dih solr example that demonstrates to users this new way forward. This would go a long way in convincing the community we have a viable alternative to dih. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Tuesday, June 10, 2014 9:55 PM To: dev@lucene.apache.org Subject: Re: Adding Morphline support to DIH - worth the effort? Ripples in the pond again. Spreading and dying. Understandable, but still somewhat annoying. So, what would be the minimal viable next step to move this conversation forward? Something for 4.11 as opposed to 5.0? Anyone with commit status has a feeling of what - minimal - deliverable they would put their own weight behind? Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Mon, Jun 9, 2014 at 10:50 AM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote: One of the ideas over DIH discussed earlier is making it standalone. Yeah; my beef with the DIH is that it’s tied to Solr. But I’d rather see something other than the DIH outside Solr; it’s not worthy IMO. Why have something Solr specific even? A great pipeline shouldn’t tie itself to any end-point. There are a variety of solutions out there that I tried. 
There are the big 3 open-source ETLs (Kettle, Clover, Talend), and they aren't quite ideal in one way or another. And Spring Integration. And some half-baked data pipelines like OpenPipe and OpenPipeline. I never got around to taking a good look at Findwise's open-sourced Hydra, but I learned enough to know, to my surprise, that it was configured in code versus a config file (like all the others), and that's a big turn-off to me. Today I read through most of the Morphlines docs and a few choice source files and I'm super-impressed. But as you note it's missing a lot of other stuff. I think something great could be built using it as a core piece. ~ David Smiley Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley On Sun, Jun 8, 2014 at 5:51 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Jack, I found your considerations quite reasonable. One of the ideas over DIH discussed earlier is making it standalone. So, if we start from a simple Morphline UI, we can do this extraction. Then, such an externalized ETL will work better with SolrCloud than DIH works now. Presumably we can reuse DIH JDBC DataSources as a source for Morphline records. Still open questions in this approach are: - joins/caching - seem possible with Morphlines but still there is no such command - delta import - a scenario we must not forget to handle - threads (it's completely outside Morphline's concerns) - distributed processing - it would be great if we can partition the datasource, e.g. something like what's done by Sqoop ... what else? On Sun, Jun 8, 2014 at 6:54 PM, Jack Krupansky j...@basetechnology.com wrote: I've avoided DIH like the plague since it really doesn't fit well in Solr, so I'm still baffled as to why you think we need to use DIH as the foundation for a Solr Morphlines project. 
That shouldn't stop you, but what's the big impediment to taking a clean slate approach to Morphlines - learn what we can from DIH, but do a fresh, clean Solr 5.0 implementation that is not burdened from the get-go with all of DIH's baggage? Configuring DIH is one of its main problems, so blending Morphlines config into DIH config would seem to just make Morphlines less attractive than it actually is when viewed by itself. You might also consider how ManifoldCF (another Apache project) would integrate with DIH and Morphlines as well. I mean, the core use case is ETL from external data sources. And how all of this relates to Apache Flume as well. But back to the original, still unanswered, question: Why use DIH as the starting point for integrating Morphlines with Solr - unless the goal is to make Morphlines unpalatable and less approachable than even DIH itself?! Another question: What does Elasticsearch have in this area (besides rivers)? Are they headed in the Morphlines direction as well? -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Sunday, June 8,
RE: [Apache Solr] Filter query Suggester and Spellchecker
Alessandro, The spellcheck.collate feature already supports this by specifying spellcheck.maxCollationTries greater than zero. This is useful both to prevent unauthorized access to data and also to guarantee that suggested collations will return some results. But maxCollationTries accomplishes this by running the proposed collation queries against the index. If you are interested in preventing unauthorized access only, then you can probably get better performance with a lower-level filter at the term level. There is currently no way to filter the single-term suggestions. I could see this as a nice enhancement, but given the current maxCollationTries support, it may have a pretty narrow use-case. I've also thought about moving all the collate functionality to the Lucene level, so that clients other than Solr can take advantage of it. Perhaps something along the lines of your proposal could be a work in that direction? James Dyer Ingram Content Group (615) 213-4311 From: Alessandro Benedetti [mailto:benedetti.ale...@gmail.com] Sent: Wednesday, January 15, 2014 11:53 AM To: dev@lucene.apache.org Subject: Re: [Apache Solr] Filter query Suggester and Spellchecker No one? guys? 2014/1/14 Alessandro Benedetti benedetti.ale...@gmail.com Hi guys, this proposal is for an improvement. I propose to add the ability to suggest terms (for spellchecking and auto-suggest) based only on a subset of documents. In this way we can provide security implementations that will allow users to see suggestions of terms only from documents they are allowed to see. These are the proposed approaches: Filter query Auto Suggest 1) retrieve the suggested tokens from the input text using the already cutting-edge FST-based suggester 2) use a similar approach to the TermEnum because a) we have a small set of suggestions (reasonable, because we can filter to 5-10 suggestions max), so the TermEnum approach will be fast. 
b) we can get for each suggested token the posting list and intersect it with the doc-ID list resulting from the filter query; if the intersection is empty, do not return the suggestion. Filter query Spellcheck 1) we can use the already cutting-edge FSA-based direct index spellchecker and get the suggestions 2) use a similar approach to the TermEnum because a) we have a small set of suggestions (reasonable, because we can filter to 5-10 suggestions max), so the TermEnum approach will be fast. b) we can get for each suggested token the posting list and intersect it with the doc-ID list resulting from the filter query; if the intersection is empty, do not return the suggestion. Of course we will have to add a further parameter in the request handler, something like: spellcheck.qf Let me know your impressions and ideas, Cheers -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
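The intersection step proposed in (b) above can be sketched roughly as follows. This is an illustrative sketch only; the FilteredSuggest class is hypothetical, and the in-memory postings map stands in for a real TermsEnum/postings lookup:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the proposed filter-aware suggester step: keep a suggested
// term only if its posting list intersects the filter query's doc-ID set.
public class FilteredSuggest {
    static List<String> filterSuggestions(List<String> suggestions,
                                          Map<String, Set<Integer>> postings,
                                          Set<Integer> allowedDocs) {
        List<String> kept = new ArrayList<>();
        for (String term : suggestions) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            // Intersection test: does any allowed document contain the term?
            for (int doc : docs) {
                if (allowedDocs.contains(doc)) {
                    kept.add(term);
                    break;
                }
            }
        }
        return kept;
    }
}
```

With suggestions capped at 5-10 terms, this per-term intersection stays cheap, which is the performance argument made in the proposal.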
RE: have developer question about ClobTransformer and DIH
I think this code snippet attempts to map the schema.xml types to database types. If your database is indeed sending this as a LONGVARCHAR, I would expect a default resultSet.getString(index) to correctly get text from a LONGVARCHAR column. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Tuesday, May 21, 2013 9:09 AM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH Since i don't see Types.LONGVARCHAR mentioned anywhere in the DIH code base, i suspect it's falling back to some default behavior assuming String data, which doesn't account for the way LONGVARCHAR data is probably returned as an Object that needs to be streamed similar to a Clob. could this be the default behaviour?

for (Map<String, String> map : context.getAllEntityFields()) {
  String n = map.get(DataImporter.COLUMN);
  String t = map.get(DataImporter.TYPE);
  if ("sint".equals(t) || "integer".equals(t)) fieldNameVsType.put(n, Types.INTEGER);
  else if ("slong".equals(t) || "long".equals(t)) fieldNameVsType.put(n, Types.BIGINT);
  else if ("float".equals(t) || "sfloat".equals(t)) fieldNameVsType.put(n, Types.FLOAT);
  else if ("double".equals(t) || "sdouble".equals(t)) fieldNameVsType.put(n, Types.DOUBLE);
  else if ("date".equals(t)) fieldNameVsType.put(n, Types.DATE);
  else if ("boolean".equals(t)) fieldNameVsType.put(n, Types.BOOLEAN);
  else if ("binary".equals(t)) fieldNameVsType.put(n, Types.BLOB);
  else fieldNameVsType.put(n, Types.VARCHAR);
}

-- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064916.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
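The type mapping being discussed above can be restated as a compilable sketch; the TypeMap class and sqlType method are hypothetical names, not DIH's actual code:

```java
import java.sql.Types;

// Sketch of the DIH mapping discussed above: schema.xml type names map
// to java.sql.Types constants, with VARCHAR as the default -- which is
// why unrecognized types (including LONGVARCHAR columns) end up being
// read via resultSet.getString().
public class TypeMap {
    static int sqlType(String t) {
        if ("sint".equals(t) || "integer".equals(t)) return Types.INTEGER;
        if ("slong".equals(t) || "long".equals(t)) return Types.BIGINT;
        if ("float".equals(t) || "sfloat".equals(t)) return Types.FLOAT;
        if ("double".equals(t) || "sdouble".equals(t)) return Types.DOUBLE;
        if ("date".equals(t)) return Types.DATE;
        if ("boolean".equals(t)) return Types.BOOLEAN;
        if ("binary".equals(t)) return Types.BLOB;
        return Types.VARCHAR; // default: anything else is treated as String data
    }
}
```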
RE: have developer question about ClobTransformer and DIH
Yes, that is correct. So it is going to do resultSet.getString(zzz) for any type it cannot address with the case statement. This should be fine if your db is returning a LONGVARCHAR. I see in the code also that if you specify <dataSource convertType="false" ... /> it will do resultSet.getObject(zzz) on everything. I doubt it, but this might address your problem in the case of a jdbc driver doing something out of the ordinary. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Tuesday, May 21, 2013 9:58 AM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH james, just trying to learn more about the source code, looking at JdbcDataSource.java, it looks like this is the default behavior of the case statement in method getARow():

default:
  result.put(colName, resultSet.getString(colName));
  break;

-- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064934.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: have developer question about ClobTransformer and DIH
Chris, I'm basing my opinion here on this statement, "The method ResultSet.getString, which allocates and returns a new String object, is recommended for retrieving data from CHAR, VARCHAR, and LONGVARCHAR fields," from section 9.3.1 of this document: http://docs.oracle.com/javase/1.4.2/docs/guide/jdbc/getstart/mapping.html I realize this is from 1.4.2 but I could not find a newer version of this document. I would not expect (blindly this time) it to have changed in a backwards-incompatible way. In the end, of course, it really depends on a particular jdbc driver's implementation. If an obscure database's jdbc driver--after the user did a funny workaround to get some other tool to work--is returning bytes or hex addresses or whatever, and you can solve it with a cast...why would we want to modify our code to make this particular case work more smoothly? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Tuesday, May 21, 2013 10:28 AM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH : If your database is indeed sending this as a LONGVARCHAR, I would expect : a default resultset.getString(index) to correctly get text from a : LONGVARCHAR column. James: how certain is your expectation? Based on the sparse mentions of LONGVARCHAR in the ResultSet class docs, i'm not convinced getString() will do the right thing http://docs.oracle.com/javase/6/docs/api/java/sql/ResultSet.html#getAsciiStream%28int%29 -Hoss
RE: have developer question about ClobTransformer and DIH
I think you're confusing the hierarchy of your database's types with the hierarchy in Java. In Java, a java.sql.Blob and a java.sql.Clob are 2 different things. They do not extend a common ancestor (except java.lang.Object). To write code that deals with both means you need to have separate paths for each object type. There is no way around this. (Compare the situation with Integer, Float, BigDecimal, etc., which all extend Number... In that case, your jdbc code can just expect a Number back from the database regardless of what object a particular jdbc driver decided to return to you.) James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Friday, May 17, 2013 9:01 PM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH i still have a disconnect on this (see below). i have been reading on the informix site about BLOB, CLOB and TEXT types. *i mis-stated earlier that a TEXT type is another type of informix blob - after reading the docs, this is not true.* "I think what it comes down to is that a Clob is-not-a Blob." the informix docs indicate the opposite: CLOB and BLOB are sub-classes of smart object types. 
what is a smart object type (the super class for BLOB and CLOB): http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm what is a BLOB type: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm what is a CLOB type: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm what is a TEXT type: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm after reading the above - my disconnect lies with the following: if an informix TEXT type is basically text - then why did solr return the two TEXT fields as binary addresses, when i removed all references to ClobTransformer and the clob=true switches from the fields in the db-config.xml file?? if TEXT is just text, then there should be no need to leverage ClobTransformer and to cast TEXT type fields as CLOBs. see my earlier post on the solr users group for the detail: http://lucene.472066.n3.nabble.com/having-trouble-storing-large-text-blob-fields-returns-binary-address-in-search-results-td4063979.html#a4064260 mark -- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064323.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: svn commit: r1484015 - in /lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext: ./ jcl-over-slf4j-1.6.6.jar jul-to-slf4j-1.6.6.jar log4j-1.2.16.jar slf4j-api-1.6.6.jar slf4j-log4j12-1.6.6.jar
My apologies. I will revert now. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Monday, May 20, 2013 2:48 PM To: Lucene/Solr dev Subject: Re: svn commit: r1484015 - in /lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext: ./ jcl-over-slf4j-1.6.6.jar jul-to-slf4j-1.6.6.jar log4j-1.2.16.jar slf4j-api-1.6.6.jar slf4j-log4j12-1.6.6.jar James, I'm assuming this was a mistake? Can you revert it? Thanks. Mike McCandless http://blog.mikemccandless.com On Sat, May 18, 2013 at 4:41 AM, Uwe Schindler u...@thetaphi.de wrote: What happened here?: - We don't use 4.2 branch anymore - Please don't commit JAR files - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: jd...@apache.org [mailto:jd...@apache.org] Sent: Saturday, May 18, 2013 12:13 AM To: comm...@lucene.apache.org Subject: svn commit: r1484015 - in /lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext: ./ jcl-over-slf4j-1.6.6.jar jul-to-slf4j-1.6.6.jar log4j-1.2.16.jar slf4j-api-1.6.6.jar slf4j-log4j12-1.6.6.jar Author: jdyer Date: Fri May 17 22:13:05 2013 New Revision: 1484015 URL: http://svn.apache.org/r1484015 Log: initial buy Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/ lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/log4j-1.2.16.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-api-1.6.6.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-log4j12-1.6.6.jar (with props) Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/log4j-1.2.16.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/log4j-1.2.16.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-api-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-api-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-log4j12-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-log4j12-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available.
RE: have developer question about ClobTransformer and DIH
I think the usual practice is to use BLOB types to store data that is not a character stream, so your case is probably pretty rare. If casting solves the issue, then why not? I think people use casts all the time to solve these types of compatibility issues. Then again, if ClobTransformer were changed to handle BLOBs also, I do not see the harm. But I would think it would be a much more common case that users would be putting binary-format documents in BLOBs and then feeding them to Tika or something to extract the text. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Friday, May 17, 2013 1:34 PM To: dev@lucene.apache.org Subject: have developer question about ClobTransformer and DIH hello, this is my first post to this forum - if this question is not correct for this forum (or has been addressed in another jira) - just let me know ;) environment: solr 3.5 informix 11.x centos Problem statement: ClobTransformer (./solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/ClobTransformer.java) stopped working when two columns in the table were converted from CLOB to Text. More Detail: recently i ran into an issue while attempting to use the DIH against an informix table. the DIH and ClobTransformer were working well with two (2) fields that were defined as CLOB. to resolve another informix-specific issue, the two fields were changed to Text fields (another type of informix blob). after the change, another full import was done and it was discovered that these two fields were being returned with the classic hex address that denotes a binary field in the schema. after quite a bit of experimentation and discussion with the DBAs, i cast the two columns as clob. example: cast(att.attr_val AS clob) as attr_val, cast(rsr.rsr_val AS clob) as rsr_val, after doing this - the issue was resolved. Questions: 1) is this a known issue? 2) is this the prescribed remedy for this type of situation, using this version of solr (3.5)? 3) can i get more detail on why the ClobTransformer does not work with other blob-like fields? finally - i looked at the code for ClobTransformer (and Transformer) and was wondering if it is possible to change or add another class that would handle this use case out of the box. thx mark -- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: have developer question about ClobTransformer and DIH
I think what it comes down to is that a Clob is-not-a Blob. So any code dealing with Clobs that also wants to deal with Blobs and do the same thing with them is going to need to first check the object type returned from the jdbc driver, then do separate logic depending on the object type returned. Specifically, if it is a java.sql.Clob, it needs to call getCharacterStream, but if it is a java.sql.Blob, getBinaryStream. Possibly there are other gotchas about making assumptions about the binary stream? Then again, if a user uses ClobTransformer on a Blob, then perhaps you can assume all you want about what the binary stream is going to be? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Friday, May 17, 2013 4:44 PM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH Hello James, "I think the usual practice is to use BLOB types to store data that is not a character stream, so your case is probably pretty rare" admittedly - if the fields had been left as clob fields, then all would have been well. the change to informix Text blobs was driven by the need to use the informix dbload utility to push data into the target table, before using the DIH to pull data from the target table into the core. "If casting solves the issue, then why not?" ok - i will concede this point - but i am interested in why ClobTransformer _needs_ the cast to work in the first place. "Then again if ClobTransformer was changed to handle BLOBs also, I do not see the harm" if possible - i would like to understand more about ClobTransformer and what would be needed to make that change. "But I would think it would be a much more common case that users would be putting binary-format documents in BLOBs then feeding them to tika or something to extract the text." i am not sure - maybe. at SHC (Sears) the data being stored in these two columns is a large JSON blob. when a query is performed, the JSON blob is parsed and used as needed. thanks again for the discussion and education. mark -- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064289.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
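The type-dispatch described above (getCharacterStream for a Clob, getBinaryStream for a Blob) can be sketched as a self-contained example. This is a hypothetical illustration, not the ClobTransformer source; it uses the JDK's SerialClob/SerialBlob so it runs without a database, and it assumes the Blob's bytes are UTF-8 text, as a user deliberately pointing a transformer at a text-bearing BLOB might:

```java
import java.io.InputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.sql.Blob;
import java.sql.Clob;
import javax.sql.rowset.serial.SerialBlob;
import javax.sql.rowset.serial.SerialClob;

// Clob and Blob share no common ancestor besides Object, so code that
// accepts both must branch on the runtime type and use a separate read path
// for each.
public class LobDispatchSketch {
    public static String readLob(Object lob) throws Exception {
        if (lob instanceof Clob) {
            Clob c = (Clob) lob;
            try (Reader r = c.getCharacterStream()) { // character path
                StringBuilder sb = new StringBuilder();
                int ch;
                while ((ch = r.read()) != -1) sb.append((char) ch);
                return sb.toString();
            }
        } else if (lob instanceof Blob) {
            Blob b = (Blob) lob;
            try (InputStream in = b.getBinaryStream()) { // byte path
                // Assumption: the binary stream carries UTF-8 text.
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
        return String.valueOf(lob); // fall back for plain values
    }

    public static void main(String[] args) throws Exception {
        Clob clob = new SerialClob("hello clob".toCharArray());
        Blob blob = new SerialBlob("hello blob".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLob(clob)); // hello clob
        System.out.println(readLob(blob)); // hello blob
    }
}
```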
RE: [Discussion] Discontinue the ROLLBACK command in Solr?
We use rollback (in conjunction with DIH) when doing full re-imports on a traditional non-cloud index. As DIH first deletes all documents then adds them all, it's handy for it to roll back its changes if something goes wrong. Then the indexing node can simply return to service executing queries until the problem is solved, etc. Would it be acceptable to retain rollback for non-cloud indexes that do not have atomic updates, etc. enabled? We could even put an "enable rollback" parameter in the config that is turned off by default, so users can be made to think about it before turning it on, etc. Of course, if rollback were removed, the workaround is to take a backup, then attempt the reindex, then restore the backup on failure. This is custom scripting that is currently done automatically. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, May 08, 2013 8:08 PM To: dev@lucene.apache.org Subject: Re: [Discussion] Discontinue the ROLLBACK command in Solr? : Many are confused about the rollback feature in Solr, since it cannot : guarantee a rollback of updates from that client since last commit. : : In my opinion it is pretty useless to have a rollback feature you cannot : rely upon - Unless, that is, you are the only client for sure, having no : autoCommit, and a huge RAMbuffer. : : So why don't we simply deprecate the feature in 4.x and remove it from 5.0? +1 ... i don't remember the details of how/why/where rollback works, but as i understand it, there are some serious caveats to its usage, as well as some bugs that may not have any viable/simple solutions (at least as far as i know of). example... https://issues.apache.org/jira/browse/SOLR-4733 -Hoss
RE: Mini-proposal: Standalone Solr DIH and SolrCell jars
Someday it would be nice to see DIH be able to run in its own JVM, for just the reason Jack mentions. There are quite a few neat things like this that could be done with DIH, but I've tried to work more on improving the tests, fixing bugs, and generally making the code more attractive to developers. I don't think DIH has a chance to really grow up until these types of things get done. I know nothing about Solr Cell except that a few people on the mailing list have been burned trying to run it in production only to learn that it doesn't scale. At least that's the general gist I've heard: for prototyping purposes only. Maybe if it is re-architected as a stand-alone app it would fare better? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Friday, March 22, 2013 9:07 PM To: dev@lucene.apache.org Subject: Re: Mini-proposal: Standalone Solr DIH and SolrCell jars On 3/22/2013 7:04 PM, Jack Krupansky wrote: I wanted to get some preliminary feedback before filing this proposal as a Jira(s): Package Solr Data Import Handler and Solr Cell as standalone jars with command line interfaces to run as separate processes to promote more efficient distributed processing, both by separating them from the Solr JVM and allowing multiple instances running in parallel on multiple machines. And to make it easier for mere mortals to customize the ingestion code without diving deep into core Solr. That's a really interesting idea. You mentioned having them be grown-up siblings of the SimplePostTool, which would imply that the jar would be directly executable. What would be the mechanism for configuring it and getting DIH status? An alternate idea, if it's feasible, would be that you could drop the jar and its dependencies into a lib directory and embed it into an index update application. Hopefully it is only tied to SolrJ, not deep Solr or Lucene internals. I haven't checked. 
Thanks, Shawn - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: 4.1 release notes: please review
Steve, This is pretty longwinded and maybe just the first sentence will suffice, with "see the wiki for more information." All of this is documented there, more or less. None of this will affect very many people. -- The DataImportHandler contrib module has some minor backwards-compatibility breaks in this release.
1. Both NumberFormatTransformer and DateFormatTransformer default to the root locale if none is specified. Prior versions used the JVM default locale. It is strongly advised that users always specify the locale when using these transformers. See https://issues.apache.org/jira/browse/SOLR-4095
2. Both FileDataSource and FieldReaderDataSource default to UTF-8 encoding if none is specified. Prior versions used the JVM default. See https://issues.apache.org/jira/browse/SOLR-4096 . Also, the behavior of DataSource and encoding may change again in a subsequent release. See https://issues.apache.org/jira/browse/SOLR-2347 .
3. The formatDate evaluator now defaults to using the root locale. Prior versions used the JVM default. Both the locale and timezone can now be specified using new optional parameters. See https://issues.apache.org/jira/browse/SOLR-4086 and https://issues.apache.org/jira/browse/SOLR-2201 .
4. The dataimport.properties file, which holds the last indexed timestamp for use with delta imports, now by default uses the root locale. This default can be overridden using the new <propertyWriter /> tag in data-config.xml. Prior versions used the default JVM locale. This is only of concern if your default locale uses different DateFormatSymbols than the root locale and if your installation depends on these alternate symbols (for instance, if your RDBMS takes dates using your locale-specific date symbols). See https://issues.apache.org/jira/browse/SOLR-4051
5. The experimental DIHProperties interface has changed, and is now an abstract class. This will require code changes for anyone who has a custom DIHProperties. Also note that future API changes with this class are possible in subsequent releases. See https://issues.apache.org/jira/browse/SOLR-4051
6. The Evaluator framework has received extensive refactoring. Some custom evaluators may require code changes. Specifically, public or protected methods from the EvaluatorBag class have been moved to the Evaluator abstract class that all Evaluators must extend. See https://issues.apache.org/jira/browse/SOLR-4086
-- James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Thursday, January 17, 2013 1:26 AM To: dev@lucene.apache.org Subject: 4.1 release notes: please review I took a crack at the Solr release note. I added CommonTermsQuery to the Lucene release note that Robert has been maintaining - looks good to me otherwise. Please help me whip these into shape. Solr: http://wiki.apache.org/solr/ReleaseNote41 Lucene: http://wiki.apache.org/lucene-java/ReleaseNote41 Thanks, Steve
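The locale changes in items 1, 3, and 4 above all stem from the same hazard: locale-sensitive date patterns render differently under different locales, so text written with the JVM default may not parse back under the root locale. A minimal sketch (hypothetical class name, not DIH code) illustrating the difference:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Shows why defaulting to Locale.ROOT matters: the "MMM" pattern element is
// locale-sensitive, so the same instant renders differently per locale.
public class LocaleDateSketch {
    public static String format(Date d, Locale locale) {
        SimpleDateFormat fmt = new SimpleDateFormat("dd-MMM-yyyy", locale);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // pin timezone for reproducibility
        return fmt.format(d);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(format(epoch, Locale.ROOT));   // 01-Jan-1970
        System.out.println(format(epoch, Locale.FRENCH)); // localized month abbreviation
    }
}
```

A properties file or timestamp written with the French month abbreviation would fail to parse under a root-locale SimpleDateFormat, which is exactly the delta-import concern item 4 describes.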
RE: 4.1 release notes: please review
Do you think it is appropriate that we put all of this in a section in the release notes, or something more succinct? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Thursday, January 17, 2013 10:36 AM To: dev@lucene.apache.org Subject: Re: 4.1 release notes: please review Hi James, Please go ahead edit the wiki page - I'm sure you'll do a better job of summarizing these than me. Steve On Jan 17, 2013, at 11:31 AM, Dyer, James james.d...@ingramcontent.com wrote: [...]
RE: 4.1 release notes: please review
Ok I have it in the wiki in its own section but I condensed it. Feel free to edit further as you desire. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Thursday, January 17, 2013 10:59 AM To: dev@lucene.apache.org Subject: Re: 4.1 release notes: please review My take on release notes is that they mainly talk about new features/big changes, and if other things are mentioned, they are only mentioned briefly. But if you think it's worth its own section, go for it. Steve On Jan 17, 2013, at 11:47 AM, Dyer, James james.d...@ingramcontent.com wrote: [...]
RE: Possible bug in Solr SpellCheckComponent if more than one QueryConverter class is present
Jack, Did you test this to see if you could trigger this bug? But in any case, can you open a jira ticket so this won't fall under the radar? Even if the comment that was put here is true, I guess we should minimally throw an exception, or use the first one and log a warning, maybe? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Sunday, January 13, 2013 1:24 PM To: Lucene/Solr Dev Subject: Possible bug in Solr SpellCheckComponent if more than one QueryConverter class is present Reading through the code for Solr SpellCheckComponent.java for 4.1, it looks like it neither complains nor defaults reasonably if more than one QueryConverter class is present in the Solr lib directories:

Map<String, QueryConverter> queryConverters = new HashMap<String, QueryConverter>();
core.initPlugins(queryConverters, QueryConverter.class);
//ensure that there is at least one query converter defined
if (queryConverters.size() == 0) {
  LOG.info("No queryConverter defined, using default converter");
  queryConverters.put("queryConverter", new SpellingQueryConverter());
}
//there should only be one
if (queryConverters.size() == 1) {
  queryConverter = queryConverters.values().iterator().next();
  IndexSchema schema = core.getSchema();
  String fieldTypeName = (String) initParams.get("queryAnalyzerFieldType");
  FieldType fieldType = schema.getFieldTypes().get(fieldTypeName);
  Analyzer analyzer = fieldType == null ? new WhitespaceAnalyzer(core.getSolrConfig().luceneMatchVersion) : fieldType.getQueryAnalyzer();
  //TODO: There's got to be a better way! Where's Spring when you need it?
  queryConverter.setAnalyzer(analyzer);
}

No else! And queryConverter is not initialized, except for that code path where there was zero or one QueryConverter class. -- Jack Krupansky
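The guard James suggests ("minimally throw an exception, or use the first one and log a warning") could look like the following. This is a hypothetical standalone sketch, not SpellCheckComponent code, using a generic plugin map so it runs without Solr:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a fail-fast plugin selector: fall back to a default when no
// plugin is registered, use the single plugin when exactly one is, and
// throw instead of silently leaving the reference uninitialized when
// more than one is found.
public class PluginGuardSketch {
    public static <T> T selectSingle(Map<String, T> plugins, T defaultPlugin) {
        if (plugins.isEmpty()) {
            return defaultPlugin; // mirrors the "No queryConverter defined" fallback
        }
        if (plugins.size() > 1) {
            throw new IllegalStateException(
                "Expected at most one plugin but found " + plugins.size()
                + ": " + plugins.keySet());
        }
        return plugins.values().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, String> one = new HashMap<>();
        one.put("queryConverter", "converterA");
        System.out.println(selectSingle(one, "default")); // converterA

        Map<String, String> none = new HashMap<>();
        System.out.println(selectSingle(none, "default")); // default
    }
}
```

Picking the first entry and logging a warning, the other option mentioned, would just replace the throw with a log statement and a `return plugins.values().iterator().next();`.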
RE: DIH - Using temporary Config from Request-Parameter, partial broken?
The behavior was changed in 4.0-ALPHA with SOLR-2115. See especially my comment from July 20, 2012. There are 3 important changes here: - you can specify a new data-config.xml filename or location on the request using the "config" parameter. You do not need to put one in solrconfig.xml, but still may, to have a default. - As an alternative to using a data-config.xml file, you can always pass a full configuration on the request using the "dataConfig" parameter. You used to be able to do that only in debug mode. - the data-config.xml is always parsed and re-loaded with each import. This makes it unnecessary to issue "reload-config" every time you want to use a new configuration. I think debug mode used to do this also, but now it always does this. Although I'm probably the one person doing the most work on DIH code currently, I've never used the interactive debug mode. It's sort of documented a little at http://wiki.apache.org/solr/DataImportHandler#Interactive_Development_Mode . The most important aspect of it is that it activates all of that DebugLogger code that is everywhere cluttering up DIH. I think the interactive screens are supposed to take in all of those log messages and do something with them graphically for the user. I don't mean to discourage you, but I was kinda hoping if SOLR-4151 was left for dead long enough and people got used to it not being there, we could just kill DebugLogger... James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Stefan Matheis [mailto:matheis.ste...@gmail.com] Sent: Friday, January 11, 2013 3:36 PM To: dev@lucene.apache.org Subject: DIH - Using temporary Config from Request-Parameter, partial broken? 
Hey Guys While working on SOLR-4151 (DIH 'debug' mode missing from 4.x UI) I skimmed through the code and found this one: 129 | if (DataImporter.SHOW_CONF_CMD.equals(command)) { 130 | String dataConfigFile = params.get("config"); 131 | String dataConfig = params.get("dataConfig"); 132 | if(dataConfigFile != null) { 133 | dataConfig = SolrWriter.getResourceAsString(req.getCore().getResourceLoader().openResource(dataConfigFile)); 134 | } 135 | if(dataConfig==null) { 136 | rsp.add("status", DataImporter.MSG.NO_CONFIG_FOUND); 137 | } else { 138 | // Modify incoming request params to add wt=raw from http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataImportHandler.java?view=markup What it *should* do, related to the description of the issue (SOLR-2115), is: accept a temporary config (provided by a request parameter) and use it instead of the defined one .. but, as far as I understand the code: any provided config will get overwritten if there is a config file defined in your solrconfig. There is no check in place that this fallback should only happen if there was no (temporary) configuration given .. or am I missing something really important but maybe not completely obvious here? Stefan - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
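The precedence Stefan expects could be sketched like this. This is hypothetical, simplified logic, not a patch against DataImportHandler; the parameter names only mirror the DIH request parameters:

```java
// Hypothetical sketch of the precedence SOLR-2115 describes: an inline
// "dataConfig" passed on the request should win over the "config" file named
// in solrconfig.xml. Simplified pseudologic, not a DataImportHandler patch.
public class DihConfigPrecedence {
    static String resolveConfig(String inlineDataConfig, String configFileContents) {
        if (inlineDataConfig != null) {
            // a temporary config on the request takes priority
            return inlineDataConfig;
        }
        // only fall back to the solrconfig.xml-defined file when none was given
        return configFileContents;
    }

    public static void main(String[] args) {
        System.out.println(resolveConfig("<dataConfig/>", "<fromFile/>")); // inline wins
        System.out.println(resolveConfig(null, "<fromFile/>"));            // file fallback
    }
}
```

In the snippet Stefan quotes, the file branch runs unconditionally whenever dataConfigFile is non-null, which is exactly the missing check.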
RE: DIH - Using temporary Config from Request-Parameter, partial broken?
I don't know of anything that doesn't currently work with reloading the config files. I'm not sure the unit test on it handles the case where both "config" and "dataConfig" are specified. I guess I don't know what happens in that case. Maybe that could be the bug? I did see where the response for having no config at all would be better as a 404 than a 200, and I agree with that. Also, I don't want to discourage you from including the debugger if you've already done (most of) the work on the front-end. If the work is done, someone out there will appreciate it. I just didn't imagine this would get fixed so quickly; I thought if few people complained we could just deep-six the feature. If it survives, then possibly the backend code can be improved, tests can be written, it can be better documented, etc. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Stefan Matheis [mailto:matheis.ste...@gmail.com] Sent: Friday, January 11, 2013 4:21 PM To: dev@lucene.apache.org Subject: Re: DIH - Using temporary Config from Request-Parameter, partial broken? Hey James Thanks for the quick reply! I already read the comments on SOLR-2115, was just not sure if the case was not described and therefore not really existing, or maybe just forgotten - so to confirm, what doesn't work is having a config file defined in solrconfig and still overwriting that with a configuration provided by request, right? On Friday, January 11, 2013 at 11:06 PM, Dyer, James wrote: I don't mean to discourage you but I was kinda hoping if SOLR-4151 was left for dead long enough and people got used to it not being there we could just kill DebugLogger... I'm completely fine with that James, no worries :) I never used it myself, I just took it into the work while working on other dataimport things in the UI .. 
if you like to drop that, we could easily revert that part of the work and avoid people starting to use it (again) just because it's there Stefan
failure with oal.util.TestMaxFailuresRule
I can reliably reproduce a failure in trunk on this test with: -Dtests.seed=3FACDC7EBD23CB80:3D65D783617F94F1 This happens on both Linux and Windows. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: lost entries in trunk/lecene/CHANGES.txt
I do apologize for causing problems. But I do usually merge. However, if it is a trivial change (say, just a small test fix) it is a ton faster to just make the change to both branches instead of doing a merge. I guess I do not understand why this causes problems with seemingly unrelated code (I can be pretty sure the code involved with LUCENE-4585 is entirely separate from the code I've been modifying). Is it really a bad thing to make a trivial change this way? Perhaps the issue is that when I do a merge, if I notice directories that have property changes only, I omit them. Should I be including these? Often these are seemingly random directories and I never quite understand why they are being included. (Maybe it's just my ignorance of svn.) Perhaps this is the problem? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Sunday, December 09, 2012 3:59 AM To: dev@lucene.apache.org Subject: RE: lost entries in trunk/lecene/CHANGES.txt Hi, I checked a little bit in the commit logs what was going on. From what I can reconstruct: - James Dyer did not use SVN merging to 4.x; he copied the whole file into the 4.x folder. This explains why the 5.0 changes entries suddenly appeared in the 4.x branch (which I removed yesterday). James seems to never merge his changes between branches; he applies the patch several times or just copies files. - The commit where the entries got lost, that Doron restored an hour ago, seems to have copied an older version of the CHANGES.txt file over the newer version in SVN. This should be impossible with SVN, unless you “svn up” your current working directory and fix the conflicts by telling SVN to use the older modified (“your”) version instead of doing a 3-way merge. One should use 3-way merge to do this (e.g. with TortoiseSVN or Subclipse or by hand, arrgh ☺). It looks like James created the patch with an older SVN checkout but failed to merge the changes. 
James: Can you in the future please use “svn merge” (or the corresponding workflow in your GUI) to merge the changes between branches. This merge adds special “properties” to the SVN log, so one can find out which patches were merged between branches. E.g. TortoiseSVN or Subclipse show those in a different color in the commit log, which helps immensely if you are about to merge some changes. If you need some help with merging correctly, read http://wiki.apache.org/lucene-java/SvnMerge or just ask me. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Sunday, December 09, 2012 10:15 AM To: dev@lucene.apache.org Subject: RE: lost entries in trunk/lecene/CHANGES.txt They were partly (but in a different way) also missing in 4.x. I synced the part from version 4.1 down to version 0 with trunk. 3 entries were missing. Trunk now only has 5.0 as an additional section; the remaining stuff is identical. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Doron Cohen [mailto:cdor...@gmail.com] Sent: Sunday, December 09, 2012 9:30 AM To: dev@lucene.apache.org Subject: lost entries in trunk/lecene/CHANGES.txt Hi, seems some entries were lost when committing LUCENE-4585 (Spatial PrefixTree based Strategies). http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/CHANGES.txt?r1=1418005&r2=1418006&pathrev=1418006&view=diff I think I'll just add them back... Doron
RE: lost entries in trunk/lecene/CHANGES.txt
I'm using Subclipse with JavaHL 1.7.7. I am unclear whether JavaHL keeps its versioning equivalent to official svn versions. I do not have an official svn command line installed, do not use Tortoise or other tools, etc. Reading Uwe's comment that I never merge, I do wonder if it's just that I should let the directory property changes merge in also, even if I do not understand them. I just don't like to commit stuff that seems unrelated to what I'm doing and that I don't understand. This fits, because if it appears I never merge, I also always omit seemingly unrelated property changes when committing a merge. I also would like an answer to my question: Is it ok to make parallel changes instead of a merge if it's just a trivial change? Follow-up question: is it ok to make the same (trivial) change to 2 branches with 1 commit? It really is very slow for me to merge, and if the way I've handled trivial changes in the past breaks things for other people, I can change my ways, or just not fix tiny things if time doesn't allow. Especially when I get an unexpected jenkins test failure, I'm usually in the middle of something else and really want to fix jenkins asap but can't always give it a lot of time (getting more coffee, as you might say to do, Robert) while waiting for svn, etc. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, December 10, 2012 9:57 AM To: dev@lucene.apache.org Subject: Re: lost entries in trunk/lecene/CHANGES.txt On Mon, Dec 10, 2012 at 10:53 AM, Dyer, James james.d...@ingramcontent.com wrote: Perhaps the issue is when I do a merge, if I notice directories that have property changes only I omit them. Should I be including these? Often these are seemingly random directories and I never quite understand why these are being included. (Maybe it's just my ignorance of svn.) Perhaps this is the problem? Are you using svn 1.7? I really recommend this! 
RE: TestSqlEntityProcessorDelta failures on Policeman Jenkins
ahhh. I did not know that Policeman overrides (or that you could override) the run-test-serially setting in DIH's build.xml. This explains everything, as placing the file in a private temp directory rather than the default conf dir would solve the issue. I think then keeping it as it is (dialing back the logging, but keeping the properties file in its own temp dir) solves this issue. And if indeed dataimport.properties is the only thing that prevents the DIH tests from running in parallel, it should be an easy enough task to fix this for all the tests, and then we can have parallel tests for DIH. Thanks a bunch to everyone for helping get this cleared up! James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, December 05, 2012 1:31 PM To: dev@lucene.apache.org Subject: Re: TestSqlEntityProcessorDelta failures on Policeman Jenkins : James: How many JVMs does your machine use (you see this at the beginning : when tests start to run)? ... : ok this is the bug. See dih's build.xml: : : <!-- the tests have some parallel problems: writability to single copy of : dataimport.properties --> : <property name="tests.jvms" value="1"/> : : The problem is: policeman jenkins server overrides this by setting the -D : ...and i think, in the specific case of TestSqlEntityProcessorDelta (or more specifically: anything extending AbstractSqlEntityProcessorTestCase) it looks like James fixed the bug in the test when he added the code to help log the state of the file... https://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractSqlEntityProcessorTestCase.java?r1=1408873&r2=1417058 ...because he has the test create a random dir for the property writer to write the file for each test class. right? 
-Hoss
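The per-test-class temp-directory approach discussed above can be sketched in a few lines. This is illustrative only and not the actual Solr test code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch (not the actual Solr test code): give each test class
// its own temp directory for dataimport.properties so parallel test JVMs
// never contend for one shared copy in the default conf dir.
public class PerTestPropertiesDir {
    static Path propertiesFileFor(String testClassName) {
        try {
            // a fresh, uniquely named directory per test class (and per run)
            Path dir = Files.createTempDirectory("dih-" + testClassName + "-");
            return dir.resolve("dataimport.properties");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path a = propertiesFileFor("TestSqlEntityProcessorDelta");
        Path b = propertiesFileFor("TestSqlEntityProcessorDelta");
        // distinct directories, so concurrent writes cannot collide
        System.out.println(!a.equals(b)); // true
    }
}
```

Because createTempDirectory guarantees a unique name per call, two JVMs running the same test class can never step on each other's properties file.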
RE: Active 4.x branches?
Whenever I want to know who owns a piece of code, I just look at the svn history to see who has been modifying it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: David Smiley (@MITRE.org) [mailto:dsmi...@mitre.org] Sent: Thursday, November 29, 2012 8:49 AM To: dev@lucene.apache.org Subject: Re: Active 4.x branches? Those are good points Yonik. I guess I don't know what to think anymore. Yonik Seeley-4 wrote On Thu, Nov 29, 2012 at 1:24 AM, David Smiley (@MITRE.org) <DSMILEY@> wrote: Maybe we should have a roster somewhere of parts of the codebase that have an owner. Taking ownership is a mindset, and is very different from any kind of recognized ownership. We shouldn't tag areas as owned by someone, as that could discourage others from getting involved in that area. It might also encourage deference to the owner, which would also be a bad thing. We sometimes naturally defer to someone with more experience in an area than we have, but it should continue to be on an informal case-by-case basis. It could be useful to people not in the know on who to contact The right contact point is this mailing list. There's already way too much off-list (and off-IRC-channel) collaboration that goes on, IMO. -Yonik http://lucidworks.com - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Active-4-x-branches-tp4022609p4023246.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: Fullmetal Jenkins: Solr4X - Build # 67 - Failure!
it works! Hey... can you tell me a little about Fullmetal. What is this one doing that Policeman isn't? (Obviously something; it found my bugs twice this week and the others didn't...) James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 7:52 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! I've moved to using http://fullmetaljenkins.org/ to host the Jenkins service. Hopefully that makes it so you can visit it with your firewall - it may still be detected as a dynamic ip service though. We will see I guess. - Mark On Nov 20, 2012, at 4:36 PM, Dyer, James james.d...@ingramcontent.com wrote: Thanks Mark. I committed a fix for this. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 3:07 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! Hmmm...bummer. I may be able to address that sometime soon. 
The full trace below: Error Message expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] Stacktrace org.junit.ComparisonFailure: expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] at __randomizedtesting.SeedInfo.seed([FC935E046E15B4D4:84DDD194732F51FA]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) at org.apache.solr.handler.dataimport.TestBuiltInEvaluators.testDateFormatEvaluator(TestBuiltInEvaluators.java:127) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39
RE: Fullmetal Jenkins: Solr4X - Build # 80 - Failure!
By any chance is this Jenkins using a Java 8 build earlier than JDK 8-ea-b65? If so, then it might be hitting https://issues.apache.org/jira/browse/DERBY-5958, which, at least according to the comments, only occurs on earlier revisions of JDK 8. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Wednesday, November 21, 2012 8:47 AM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 80 - Failure! This actually looks like another locale issue: Caused by: java.sql.SQLException: Supplied territory description 'sr__#Latn' is invalid, expecting ln[_CO[_variant]] - Mark On Wed, Nov 21, 2012 at 12:18 AM, nore...@fullmetaljenkins.org wrote: Solr4X - Build # 80 - Failure: Check console output at http://fullmetaljenkins.org/job/Solr4X/80/ to view the results. 1 tests failed. REGRESSION: org.apache.solr.handler.dataimport.TestSimplePropertiesWriter.testSimplePropertiesWriter Error Message: Failed to create database 'memory:derbyDB', see the next exception for details. Stack Trace: java.sql.SQLException: Failed to create database 'memory:derbyDB', see the next exception for details. 
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.init(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection30.init(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection40.init(Unknown Source) at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source) at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source) at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source) at java.sql.DriverManager.getConnection(DriverManager.java:579) at java.sql.DriverManager.getConnection(DriverManager.java:243) at org.apache.solr.handler.dataimport.AbstractDIHJdbcTestCase.buildDatabase(AbstractDIHJdbcTestCase.java:140) at org.apache.solr.handler.dataimport.AbstractDIHJdbcTestCase.beforeDihJdbcTest(AbstractDIHJdbcTestCase.java:93) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at
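A common defensive pattern when a randomized default locale (like the sr__#Latn in the Derby error above) trips up a locale-sensitive library is to pin a known-good default around the sensitive code and restore it afterward. Whether this would actually sidestep the Derby/JDK issue is an assumption, not something verified against Derby:

```java
import java.util.Locale;
import java.util.function.Supplier;

// Defensive pattern for locale-sensitive libraries under randomized-locale
// test runs: pin a known-good default locale while the sensitive code runs,
// then restore the original. Whether this sidesteps the Derby issue above
// is an assumption.
public class LocalePin {
    static <T> T withLocale(Locale locale, Supplier<T> body) {
        Locale saved = Locale.getDefault();
        Locale.setDefault(locale);
        try {
            return body.get();
        } finally {
            Locale.setDefault(saved); // always restore the randomized locale
        }
    }

    public static void main(String[] args) {
        // the very locale from the error message: toString() is "sr__#Latn"
        Locale.setDefault(Locale.forLanguageTag("sr-Latn"));
        String inside = withLocale(Locale.US, () -> Locale.getDefault().toString());
        System.out.println(inside);              // en_US
        System.out.println(Locale.getDefault()); // sr__#Latn again
    }
}
```

The try/finally restore matters in tests: the framework's SystemPropertiesInvariantRule-style checks will complain if the default locale leaks.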
RE: Fullmetal Jenkins: Solr4X - Build # 67 - Failure!
Mark, Can you tell me which test failed? I still cannot get into Fullmetal Jenkins. Unfortunately my company's firewall blocks it due to Dynamic DNS. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 2:30 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! FYI locale type issue again: org.junit.ComparisonFailure: expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] at __randomizedtesting.SeedInfo.seed([FC935E046E15B4D4:84DDD194732F51FA]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) On Nov 20, 2012, at 2:58 PM, nore...@fullmetaljenkins.homelinux.org wrote: Solr4X - Build # 67 - Failure: Check console output at http://fullmetaljenkins.homelinux.org/job/Solr4X/67/ to view the results. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Fullmetal Jenkins: Solr4X - Build # 67 - Failure!
) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:722) On Tue, Nov 20, 2012 at 3:58 PM, Dyer, James james.d...@ingramcontent.com wrote: Mark, Can you tell me which test failed? I still cannot get into Fullmetal Jenkins. Unfortunately my company's firewall blocks it due to Dynamic DNS. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 2:30 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! FYI locale type issue again: org.junit.ComparisonFailure: expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] at __randomizedtesting.SeedInfo.seed([FC935E046E15B4D4:84DDD194732F51FA]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) On Nov 20, 2012, at 2:58 PM, nore...@fullmetaljenkins.homelinux.org wrote: Solr4X - Build # 67 - Failure: Check console output at http://fullmetaljenkins.homelinux.org/job/Solr4X/67/ to view the results. -- - Mark
RE: Fullmetal Jenkins: Solr4X - Build # 28 - Failure!
I noticed this and made a subsequent fix for this test (4x: r1411348 / trunk: r1411334). I'm having difficulty getting to this Jenkins, so I'm not sure whether this failure is from before or after that commit. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, November 19, 2012 12:27 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 28 - Failure! This test uses the SimpleDateFormat ctor in several places that implicitly uses the default locale. (And other date handling with explicit default timezone and so on; I guess that might be a separate issue.) On Mon, Nov 19, 2012 at 1:20 PM, Mark Miller markrmil...@gmail.com wrote: On Nov 19, 2012, at 1:18 PM, nore...@fullmetaljenkins.homelinux.org wrote: Solr4X - Build # 28 - Failure: Check console output at http://fullmetaljenkins.homelinux.org:8080/job/Solr4X/28/ to view the results. Looks like a failure due to a locale issue: org.junit.ComparisonFailure: expected:[๒๕๕๕-๑๑-๑๙ ๐๐:๐๐] but was:[2012-11-19 00:00] at __randomizedtesting.SeedInfo.seed([52960D776A05F024:97EE7E504322E141]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) at org.apache.solr.handler.dataimport.TestVariableResolver.testFunctionNamespace1(TestVariableResolver.java:152) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
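Robert's point about the locale-sensitive SimpleDateFormat constructor can be demonstrated in a few lines; passing an explicit locale (and timezone) makes the formatted output deterministic regardless of the randomized test locale:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Demonstrates the pitfall: new SimpleDateFormat(pattern) uses the default
// locale, so a Thai-digit or Devanagari-digit locale changes the output.
// Passing an explicit locale (and timezone) makes the result deterministic.
public class ExplicitLocaleFormat {
    public static void main(String[] args) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // pin the timezone too
        System.out.println(fmt.format(new Date(0L))); // 1970-01-01 00:00
    }
}
```

With the no-Locale constructor under a Thai default locale, the same code would print Thai digits, which is exactly the assertion failure in the Jenkins report above.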
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #153: POMs out of sync
Thank you! James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Tuesday, November 13, 2012 6:07 AM To: dev@lucene.apache.org Subject: Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #153: POMs out of sync I committed a fix to the Maven configuration: Derby is now a DIH test dependency On Nov 13, 2012, at 5:31 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/153/ 9 tests failed. FAILED: org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta.org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta Error Message: org.apache.derby.jdbc.EmbeddedDriver Stack Trace: java.lang.ClassNotFoundException: org.apache.derby.jdbc.EmbeddedDriver
RE: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2209 - Failure!
My mistake, sorry. I've got these tests set to @Ignore for now, with a better fix to follow soon. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Policeman Jenkins Server [mailto:jenk...@sd-datasolutions.de] Sent: Monday, November 05, 2012 12:41 PM To: dev@lucene.apache.org; jd...@apache.org; mikemcc...@apache.org Subject: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2209 - Failure! Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2209/ Java: 32bit/jdk1.7.0_09 -server -XX:+UseConcMarkSweepGC 2 tests failed. REGRESSION: org.apache.solr.handler.PingRequestHandlerTest.testDisablingServer Error Message: Should have thrown a SolrException because not enabled yet Stack Trace: java.lang.AssertionError: Should have thrown a SolrException because not enabled yet at __randomizedtesting.SeedInfo.seed([5F8D3F3DEBB9E6DE:8D7CAB800C63B692]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.handler.PingRequestHandlerTest.testDisablingServer(PingRequestHandlerTest.java:140) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at
RE: Service Unavailable exceptions not logged
This was done with https://issues.apache.org/jira/browse/SOLR-2124 . The idea is that it is enough to get a 1-line log whenever PingRequestHandler is hit (which will have the response code). There is no need to also log a severe exception with a stack trace as this is not really an error condition. So if you use PingRequestHandler to take nodes out of a load balancer rotation, it won't create huge logs. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 1:55 PM To: dev@lucene.apache.org Subject: Service Unavailable exceptions not logged Why are service unavailable exceptions not logged? In the SolrException class, these error codes are specifically skipped from logging, but I don't understand why. This is the 'log' method of the SolrException class: public static void log(Logger log, Throwable e) { if (e instanceof SolrException && ((SolrException) e).code() == ErrorCode.SERVICE_UNAVAILABLE.code) { return; } String stackTrace = toStr(e); String ignore = doIgnore(e, stackTrace); if (ignore != null) { log.info(ignore); return; } log.error(stackTrace); } Tomás
RE: Service Unavailable exceptions not logged
Maybe we could just create a second entry in the ErrorCode enum for 503, say, SERVICE_UNAVAILABLE_NOT_LOGGED, and change PingRequestHandler to throw exceptions with this new ErrorCode... James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 2:32 PM To: dev@lucene.apache.org Subject: Re: Service Unavailable exceptions not logged Hmmm, I see. The problem I'm having is that with SolrCloud, in the case of no available nodes for a shard the created exception is a 503, and this is something I would like to see logged. Maybe that exception code should be changed? On Tue, Oct 30, 2012 at 4:15 PM, Dyer, James james.d...@ingramcontent.com wrote: This was done with https://issues.apache.org/jira/browse/SOLR-2124 . The idea is that it is enough to get a 1-line log whenever PingRequestHandler is hit (which will have the response code). There is no need to also log a severe exception with a stack trace as this is not really an error condition. So if you use PingRequestHandler to take nodes out of a load balancer rotation, it won't create huge logs. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 1:55 PM To: dev@lucene.apache.org Subject: Service Unavailable exceptions not logged Why are service unavailable exceptions not logged? In the SolrException class, these error codes are specifically skipped from logging, but I don't understand why.
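The enum-based idea above can be sketched as follows. This is a hypothetical illustration, not Solr's actual ErrorCode or SolrException code: the entry name SERVICE_UNAVAILABLE_NOT_LOGGED is the proposal itself, and the logging helper is reduced to a predicate.

```java
// Hypothetical sketch of the proposal: a second ErrorCode entry for HTTP 503
// that the logging helper treats as "do not log". Not Solr's real ErrorCode enum.
public class ErrorCodeSketch {
    enum ErrorCode {
        SERVICE_UNAVAILABLE(503),
        SERVICE_UNAVAILABLE_NOT_LOGGED(503); // same HTTP code, different logging policy
        final int code;
        ErrorCode(int code) { this.code = code; }
    }

    // Keying off the enum entry rather than the numeric code means SolrCloud's
    // "no nodes available" 503 (plain SERVICE_UNAVAILABLE) would still be logged,
    // while PingRequestHandler could opt out by using the NOT_LOGGED entry.
    static boolean shouldLog(ErrorCode ec) {
        return ec != ErrorCode.SERVICE_UNAVAILABLE_NOT_LOGGED;
    }

    public static void main(String[] args) {
        System.out.println(shouldLog(ErrorCode.SERVICE_UNAVAILABLE));
        System.out.println(shouldLog(ErrorCode.SERVICE_UNAVAILABLE_NOT_LOGGED));
    }
}
```

Both entries carry code 503 on the wire; only the logging decision differs.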
This is the 'log' method of the SolrException class: public static void log(Logger log, Throwable e) { if (e instanceof SolrException && ((SolrException) e).code() == ErrorCode.SERVICE_UNAVAILABLE.code) { return; } String stackTrace = toStr(e); String ignore = doIgnore(e, stackTrace); if (ignore != null) { log.info(ignore); return; } log.error(stackTrace); } Tomás
RE: Service Unavailable exceptions not logged
Possibly better is to introduce yet one more overloaded constructor with a boolean that suppresses logging and change PRH to use it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 2:32 PM To: dev@lucene.apache.org Subject: Re: Service Unavailable exceptions not logged Hmmm, I see. The problem I'm having is that with SolrCloud, in the case of no available nodes for a shard the created exception is a 503, and this is something I would like to see logged. Maybe that exception code should be changed? On Tue, Oct 30, 2012 at 4:15 PM, Dyer, James james.d...@ingramcontent.com wrote: This was done with https://issues.apache.org/jira/browse/SOLR-2124 . The idea is that it is enough to get a 1-line log whenever PingRequestHandler is hit (which will have the response code). There is no need to also log a severe exception with a stack trace as this is not really an error condition. So if you use PingRequestHandler to take nodes out of a load balancer rotation, it won't create huge logs. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 1:55 PM To: dev@lucene.apache.org Subject: Service Unavailable exceptions not logged Why are service unavailable exceptions not logged? In the SolrException class, these error codes are specifically skipped from logging, but I don't understand why. This is the 'log' method of the SolrException class: public static void log(Logger log, Throwable e) { if (e instanceof SolrException && ((SolrException) e).code() == ErrorCode.SERVICE_UNAVAILABLE.code) { return; } String stackTrace = toStr(e); String ignore = doIgnore(e, stackTrace); if (ignore != null) { log.info(ignore); return; } log.error(stackTrace); } Tomás
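The constructor-flag alternative can be sketched like this. Again a hypothetical illustration, not Solr's real SolrException: the class, the field name "logged", and the StringBuilder standing in for a logger are all assumptions made for the sketch.

```java
// Hypothetical sketch of the alternative proposal: an extra constructor argument
// marks an exception as "already handled elsewhere, don't log". Not Solr's code.
public class QuietExceptionSketch {
    static class AppException extends RuntimeException {
        final int code;
        final boolean logged; // true = logging is suppressed for this instance
        AppException(int code, String msg) { this(code, msg, false); }
        AppException(int code, String msg, boolean logged) {
            super(msg);
            this.code = code;
            this.logged = logged;
        }
    }

    static final StringBuilder LOG = new StringBuilder(); // stand-in for a real logger

    // The log helper keys off the per-instance flag, not the 503 status code,
    // so SolrCloud's "no nodes available" 503 would still appear in the logs.
    static void log(Throwable e) {
        if (e instanceof AppException && ((AppException) e).logged) {
            return; // a ping handler would construct its 503s with logged=true
        }
        LOG.append(e.getMessage()).append('\n');
    }

    public static void main(String[] args) {
        log(new AppException(503, "no server hosting shard")); // logged
        log(new AppException(503, "ping: service disabled", true)); // suppressed
        System.out.print(LOG);
    }
}
```

The difference from the enum idea is that suppression becomes a per-throw decision instead of a per-error-code one.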
RE: [Discuss] Should Solr be an AppServer agnostic WAR or require Jetty?
From my perspective, working for a company that uses Solr on a bunch of apps, I really wish we keep it agnostic. I see the case for documenting that our testing process exclusively uses Jetty 7, we've included it in our distribution, and we recommend it. But I don't see why we need to be naming our parameters JettyThis or JettyThat and telling people they've got to use Jetty. The fact is users often need to use other containers. In my company, we use JBoss 5. That's it. We have a big support contract for it, our server admins know it, etc. If we were forced to use Jetty, then we would grudgingly use it, but then our cost of ownership just went up a little. On the other hand, expecting to test every possible container before you can tell people it's supported for a standards-compliant Java web app is just crazy. This is like saying that DIH's SQLEntityProcessor is only supported for HSQLDB because that's the one we test against, or that you can't run Lucene on Solaris because Uwe's Jenkins doesn't have a Solaris environment. Perhaps, though, there is a middle ground. Beyond telling people what we test and what we recommend, maybe we can write a few tests that check for known bugs from popular servlet/J2EE containers. Or even a wiki page that says something like: Some containers have this bug which can hurt in these instances. To check if your container is stricken with this problem, try this... But in the end, the advice should be just like what we say when people ask how big a server they need or what to set their Java heap to: test thoroughly before going to production. This is reminding me of one of my pet peeves back when we had Endeca: they had 3 supported OSes. That's it. The fact that Solr could run in any standards-compliant environment was a big plus in my mind. 
James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Friday, July 13, 2012 8:31 AM To: dev@lucene.apache.org Subject: Re: [Discuss] Should Solr be an AppServer agnostic WAR or require Jetty? On Jul 13, 2012, at 9:19 AM, Robert Muir wrote: I know the wiki used to say the release manager should go and manually test alternative containers before releasing: I refuse to do that. Its not the release manager's job. That's insane anyhow :) The RM can't thoroughly test each of the other containers as a 'step' in the release process at the end of the cycle :) Absurd. I think that basically meant just a smoke test, because it could not mean much more. Not sure how much good in the world that bought you, but I agree it's not the RM's job. We know we have a good experience with exactly one version of one web container - the one we ship. We actually have been pretty public about this over the past couple years - we have just not changed the website. I can find a multitude of quotes from various Lucene/Solr committers talking about how bad an idea it is not to use Jetty due to a variety of issues. You are asking for a poor experience. - Mark Miller lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Low pause GC for java 1.6
Bill, As you know, it really depends on the size of your index combined with which features you're using. There is really no substitute for having a good load test and monitoring tool and running multiple tests while trying different settings. My guess is that you're experiencing full gc's, even with CMS enabled. This means either your tenured (old) generation is too small or you have the -XX:CMSInitiatingOccupancyFraction set too high (it starts the CMS too late and runs out of memory before it can finish). We've found that some of the defaults the JVM picks and/or the general advice out there doesn't apply to an app like Solr, which is just a different kind of animal than the typical web frontend you might run in a J2EE container. Below are the settings I am using as a starting point for our development Solr 4.0 app. These may or may not work for you but at least should give you a basic idea of how one other installation is configured. Also, if you're using older grouping patches (I remember you worked on some of these), perhaps you're hitting some of the scalability problems that were predicted for some of these? I'm pretty sure the GA grouping features in 3.x solved these problems though. Finally, you probably will get better responses on the users list than the dev list. Also, other users might benefit from other answers you get, so perhaps you could cross-post your question. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 # Basic JVM settings. I think the 3g new generation size is bigger than you'd normally have with a typical web app but for us it makes the old gen fill up slower and have fewer CMS gc's. Minor (parnew) gc's are still fast enough for us, even with a biggish new gen. 
-XX:MaxNewSize=3000m -XX:NewSize=3000m -Xms20g -Xmx20g -XX:MaxPermSize=256m
# These are our CMS settings
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSParallelRemarkEnabled -XX:CMSMaxAbortablePrecleanTime=15000
# Trial and error found this to be the sweet spot for our 16-way machines.
-XX:ParallelGCThreads=8
# You want these so you can see in your logs what is going on. There are some tutorials on the web on how to make sense of verbose garbage collection. There's no problem using these in production.
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
# Use this on a 64-bit machine unless your JVM is too old to support it (on by default on newer JVMs, I think)
-XX:+UseCompressedOops
# We found these save a little memory
-XX:+UseStringCache -XX:+UseCompressedStrings
-Original Message- From: Bill Bell [mailto:billnb...@gmail.com] Sent: Saturday, June 30, 2012 8:49 PM To: Bill Bell Cc: dev@lucene.apache.org Subject: Re: Low pause GC for java 1.6 Nothing? Bill Bell Sent from mobile On Jun 29, 2012, at 9:09 PM, Bill Bell billnb...@gmail.com wrote: We are getting large Solr pauses on Java garbage collection in 1.6 Java. We have tried CMS. But we still have a 4 second wait on GC. What works well for Solr when using 16 GB of RAM? I have read lots of articles and now just looking for practical advice and examples. Sent from my Mobile device 720-256-8076 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Help: SOLR-3430 and Build changes
Let me apologize in advance for my (almost) complete ignorance of everything build related: Maven, Ivy, Ant, etc. Sorry! For SOLR-3430, I am introducing a dependency on derby.jar, which will be needed only to run DIH tests. So I don't want it included in the Solr .war. It just needs to be in the classpath when junit runs. 1. Where should I put the .jar/license/notice/sha1 files? 2. How do I modify the build so that it will be in the classpath for running tests only? 3. What do I need to do to get Ivy and Maven to pick it up? 4. I'll try my best to get the eclipse/intellij setup correct but I'm only able to test eclipse. I really want to get this right so please give advice. Thanks. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: Help: SOLR-3430 and Build changes
I did a little digging on this and I'm not sure relying on JavaDB is such a sure bet. It's a verbatim copy of Derby 10.2 and, while bundled in with the JVM, it's not in the classpath by default. Also, I have 2 Oracle 1.6 JVMs on my PC and only one includes it. Also, while the documentation says it is in the db directory, on my installation it's in the javadb directory. It would be tricky at best to reliably get this in the tester's classpath, I think. It would be safer I think to just include the jar. My thoughts were to eventually migrate the example to use derby instead of hsqldb. Maybe I should either change my test to use hsqldb or change the example to use derby. Then as Robert points out, it's just a minor build modification to use the jar from the example. In any case, the current Mock datasource doesn't emulate a real JDBC driver very well and I found it was extremely simple to use Derby in in-memory embedded mode (all you do is issue DriverManager#getConnection with the correct string). There are no config files, etc. I don't know if you want to call this a unit test or an integration test (and what are all those other Solr tests that use Jetty, etc?). In the end, I just want readable tests that are true to real life, which DIH lacks right now. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Wednesday, May 02, 2012 2:16 PM To: dev@lucene.apache.org Subject: RE: Help: SOLR-3430 and Build changes I have not checked this, but if the JavaDB is in the JDK official JavaDocs and is therefore part of JDK6 spec? We have to check this, but *if* the package names start with java.db or whatever it *has* to be also in alternate JDK impls. At least OpenJDK also downloads derby while building. 
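The in-memory embedded mode mentioned above can be sketched as follows. This is an illustrative example, not the SOLR-3430 test code: the database name "dihtest" and the table are made up, and the connection attempt is guarded because it only succeeds when derby.jar is actually on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;

// Sketch of Derby's in-memory embedded mode: the whole database lives in the
// JVM heap, needs no config files, and disappears when the JVM exits.
public class DerbyInMemorySketch {
    // "memory:" sub-protocol selects the in-memory back end; ";create=true"
    // creates the database on first connect.
    static String url(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        String u = url("dihtest");
        System.out.println(u);
        try (Connection conn = DriverManager.getConnection(u)) {
            // Only reached when derby.jar is on the classpath.
            conn.createStatement().execute("CREATE TABLE books (id INT, title VARCHAR(100))");
            System.out.println("connected and created table");
        } catch (Exception e) {
            System.out.println("Derby driver not on classpath: " + e.getMessage());
        }
    }
}
```

Unlike a file-backed Derby database, nothing has to be cleaned up between test runs, which is what makes this attractive for DIH tests.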
- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Wednesday, May 02, 2012 8:42 PM To: dev@lucene.apache.org Subject: Re: Help: SOLR-3430 and Build changes On Wed, May 2, 2012 at 1:51 PM, Uwe Schindler u...@thetaphi.de wrote: One note: Derby is included since JDK 6 as JavaDB together with the JDK: http://www.oracle.com/technetwork/java/javadb/overview/index.html As Lucene/Solr 4 will be using JDK 6 as the minimum requirement (in contrast to Solr 3.x, which was JDK 5), can we not simply rely on this version shipped with the JDK? That would make life easy. And for simple tests that version should be enough... But we don't require *Oracle's* implementation as a minimum requirement. We also support IBM etc. too? -- lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
How do I document bugfixes applied to 3.6 branch?
I would like to commit SOLR-3361 to 3.6 in case there is a 3.6.1. Should I start a new 3.6.1 section in changes.txt? Does it just go under 4.0 or 3.0 with a note that it is in the 3.6 branch also (but not released)? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2314 - Failure
This is a test bug. I am committing a fix... James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Monday, April 23, 2012 2:09 PM To: dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2314 - Failure Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2314/ 1 tests failed. REGRESSION: org.apache.solr.handler.TestReplicationHandler.test Error Message: Backup success not detected:?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status0/intint name=QTime1/int/lstlst name=detailsstr name=indexSize21,58 KB/strstr name=indexPath/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk-java7/checkout/solr/build/solr-core/test/S0/org.apache.solr.handler.TestReplicationHandler$SolrInstance-1335207994266/master/data/index/strarr name=commitslstlong name=indexVersion1335208021601/longlong name=generation19/longarr name=fileliststr_b.per/strstr_b_0.frq/strstr_b_nrm.cfe/strstr_b.fnm/strstr_b.fdt/strstr_b_nrm.cfs/strstrsegments_j/strstr_b_0.tim/strstr_b.fdx/strstr_b_0.tip/str/arr/lst/arrstr name=isMastertrue/strstr name=isSlavefalse/strlong name=indexVersion1335208021601/longlong name=generation19/longlst name=masterstr name=confFilesschema-replication2.xml:schema.xml/strarr name=replicateAfterstrcommit/str/arrstr name=replicationEnabledtrue/strlong name=replicatableGeneration19/long/lst/lststr name=WARNINGThis response format is experimental. It is likely to change in the future./str /response Stack Trace: java.lang.AssertionError: Backup success not detected:?xml version=1.0 encoding=UTF-8? 
response lst name=responseHeaderint name=status0/intint name=QTime1/int/lstlst name=detailsstr name=indexSize21,58 KB/strstr name=indexPath/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk-java7/checkout/solr/build/solr-core/test/S0/org.apache.solr.handler.TestReplicationHandler$SolrInstance-1335207994266/master/data/index/strarr name=commitslstlong name=indexVersion1335208021601/longlong name=generation19/longarr name=fileliststr_b.per/strstr_b_0.frq/strstr_b_nrm.cfe/strstr_b.fnm/strstr_b.fdt/strstr_b_nrm.cfs/strstrsegments_j/strstr_b_0.tim/strstr_b.fdx/strstr_b_0.tip/str/arr/lst/arrstr name=isMastertrue/strstr name=isSlavefalse/strlong name=indexVersion1335208021601/longlong name=generation19/longlst name=masterstr name=confFilesschema-replication2.xml:schema.xml/strarr name=replicateAfterstrcommit/str/arrstr name=replicationEnabledtrue/strlong name=replicatableGeneration19/long/lst/lststr name=WARNINGThis response format is experimental. It is likely to change in the future./str /response at __randomizedtesting.SeedInfo.seed([A93FC569246BDF7A:216BFAB38A97B282]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.handler.TestReplicationHandler.doTestBackup(TestReplicationHandler.java:895) at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1913) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:131) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:805) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:866) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:880) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:760) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:682) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:615) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:654) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:812) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:131) at
bad jetty jars for 3.x?
Whenever I try to run tests for 3.x I am getting problems with the jetty jars for the solr example. Before the checksums were added I was getting an error reading the jar. Now I get a bad checksum error. [licenses] CHECKSUM FAILED for ... solr\example\lib\jetty-6.1.26-patched-JETTY-1340.jar (expected: baa65a6f9940f2977fa152221522c0fce84d8c92 was: d446a42a8399e30a8c6e8cfbfb135a6111ea689c) [licenses] CHECKSUM FAILED for ... solr\example\lib\jetty-util-6.1.26-patched-JETTY-1340.jar (expected: 1cd718806c8f0baa318ea4a9c3a5e2f82e27f0e6 was: 186e4c23c58c0eb51342aec9cec92679d70f6c0c) Any ideas what I can do? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
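The failing check above compares each jar's actual SHA-1 against the committed checksum. A minimal sketch of that comparison, using a throwaway dummy file in place of the real jetty jar (all paths here are illustrative):

```shell
# Recreate the license checker's comparison with a dummy file standing in for the jar.
printf 'dummy jar contents' > /tmp/demo.jar
sha1sum /tmp/demo.jar | awk '{print $1}' > /tmp/demo.jar.sha1

expected=$(cat /tmp/demo.jar.sha1)
actual=$(sha1sum /tmp/demo.jar | awk '{print $1}')

# A mismatch here is what produces the "CHECKSUM FAILED" lines in the build output.
if [ "$expected" = "$actual" ]; then
  echo "checksum OK"
else
  echo "CHECKSUM FAILED (expected: $expected was: $actual)"
fi
```

A checksum mismatch on a freshly downloaded jar usually means the download was corrupted or truncated, which is why re-fetching the jar (e.g. after clearing the Ivy cache) is the first thing to try.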
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #449: POMs out of sync
SOLR-3011 (3.x-only bug fixes for DIH threading) included some more-intense multi-threaded unit tests. I fear that the bugs were not fully solved but occur only on certain platforms/environments. This is the second time this test has failed this week, both times during the Maven build. I'd imagine the Maven tests are run in just such a way as to trigger whatever is happening. This feature is removed in Trunk, and the best solution might be to dial back the tests a little and increase the severity of the warning on the wiki about using threads with DIH. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Wednesday, April 04, 2012 2:46 PM To: dev@lucene.apache.org Subject: RE: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #449: POMs out of sync This doesn't reproduce for me locally under Ant or under Maven. - Steve -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Wednesday, April 04, 2012 3:13 PM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #449: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-3.x/449/ 1 tests failed. 
REGRESSION: org.apache.solr.handler.dataimport.TestThreaded.testCachedThread_FullImport Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:409) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:376) at org.apache.solr.handler.dataimport.TestThreaded.verify(TestThreaded.java:73) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:630) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:536) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:457) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:508) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:146) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:36) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
RE: bad jetty jars for 3.x?
Looked a little more into it and found the issue is that http://cloud.github.com/downloads/rmuir is blocked on the network I'm on. Is hosting these on Robert's github space a permanent arrangement? If so, I imagine I can get the jars from an old checkout and manually put them in the ivy repository? Of course, if the whole build didn't fail because the example couldn't be compiled, that would be a nice plus. For instance, I've been just trying to run the DIH tests, so I don't see why I need to care if the example will compile. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, April 04, 2012 3:32 PM To: dev@lucene.apache.org Subject: Re: bad jetty jars for 3.x? : Whenever I try to run tests for 3.x I am getting problems with the jetty : jars for the solr example. Before the checksums were added I was : getting an error reading the jar. Now I get a bad checksum error. sounds like it was corrupted when downloading? try ant clean-jars and if that doesn't work then try removing it from your ivy cache and do ant clean-jars again (we probably need to add info about this to the HowToContribute page) -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #443: POMs out of sync
I tried this seed on my 4-core Windows machine several times but no failure. This test failure might indicate that the DIH threading bugs aren't really fixed in 3.6. On the other hand, users of DIH threads on 3.6 will get a deprecation warning, the wiki discourages it and the feature is gone in 4.0. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Saturday, March 31, 2012 8:45 AM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #443: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-3.x/443/ 1 tests failed. REGRESSION: org.apache.solr.handler.dataimport.TestThreaded.testCachedThread_FullImport Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:409) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:376) at org.apache.solr.handler.dataimport.TestThreaded.verify(TestThreaded.java:73) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:630) at 
org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:536) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:457) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:508) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:146) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:36) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53) at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110) at
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync
I'm looking at it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Wednesday, March 21, 2012 3:35 PM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/432/ 3 tests failed. FAILED: org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript Error Message: Cannot load Script Engine for language: JavaScript Stack Trace: org.apache.solr.handler.dataimport.DataImportHandlerException: Cannot load Script Engine for language: JavaScript at org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:76) at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:53) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:192) at org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript(TestScriptTransformer.java:122) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at 
org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at 
org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync
This is confusing me because the non-Maven build with this commit (#12834 on jenkins) passed. So that JVM has the Rhino JavaScript engine. I guess the Maven build (for Trunk) is using a different 1.6 JRE than the non-Maven build? One without Rhino? Is there any way to use the same JRE? In any case, let me add the ignore back in for this one exception. It's unlikely to mask a real problem, and it will let people who have non-Rhino-equipped 1.6 JVMs have the tests pass. I'll commit this shortly. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Dyer, James [mailto:james.d...@ingrambook.com] Sent: Wednesday, March 21, 2012 3:40 PM To: dev@lucene.apache.org Subject: RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync I'm looking at it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Wednesday, March 21, 2012 3:35 PM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/432/ 3 tests failed. 
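A quick way to see which script engines a given JVM actually registers (and hence whether "JavaScript"/Rhino is available on a particular 1.6 JRE) is a probe like the one below. It uses only the standard javax.script API; whether the JavaScript engine is present is, of course, environment-dependent.

```java
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

public class ListScriptEngines {
    // True if this JVM registers an engine under the name "JavaScript"
    // (Rhino on some 1.6 JREs, absent on others).
    static boolean hasJavaScript() {
        return new ScriptEngineManager().getEngineByName("JavaScript") != null;
    }

    public static void main(String[] args) {
        // List every engine this JVM provides and the names it answers to.
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            System.out.println(f.getEngineName() + " " + f.getEngineVersion()
                + " registered as " + f.getNames());
        }
        System.out.println("JavaScript engine present: " + hasJavaScript());
    }
}
```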
FAILED: org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript Error Message: Cannot load Script Engine for language: JavaScript Stack Trace: org.apache.solr.handler.dataimport.DataImportHandlerException: Cannot load Script Engine for language: JavaScript at org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:76) at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:53) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:192) at org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript(TestScriptTransformer.java:122) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556) at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39
RE: I have my name on the staging site
Got it. Also, I updated a few links in the wiki and put an obsolete notice (with a link) on the page about editing Forrest. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] (wana update the instructions too?) -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
I have my name on the staging site
I got my name on the staging site, but the instructions still have TBD for publishing to the real site. Help? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: Welcome James Dyer
A couple years ago I was told we were going to be adding tons of new docs to Search and we needed to do so low-cost. The only problem: our search vendor licensed their product by the document, making the low-cost goal impossible. I had seen Solr mentioned in the footnote of a book somewhere and thought maybe it was worth looking into. What I didn't realize is that switching to Solr would mean better performance, cheaper hardware, easier configuration, and a lot more flexibility. Better yet, it turned out we wouldn't lose functionality with the switch; I just needed to apply a few patches. But then there were a few little things I couldn't find a patch for, so I subscribed to the dev list and started doing what I could. All I can say is that working on open source is so much better than calling the vendor and having them say, "Nope, it can't do that. But we can file a feature request for you." The work you all do on this project is truly amazing. Thank you for letting me have a little bigger part in it as well. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Grant Ingersoll [mailto:gsing...@apache.org] Sent: Friday, February 10, 2012 7:08 AM To: dev@lucene.apache.org Subject: Welcome James Dyer I'm pleased to announce the PMC has elected James Dyer to be a committer on the project and he has agreed to join. Welcome aboard, James! -Grant - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
SolrCore.java imports org.eclipse.jdt.core.dom.ThisExpression
I'm wondering if the import for org.eclipse.jdt.core.dom.ThisExpression in SolrCore.java introduced in r1196797 (SOLR-2861) was a mistake. It adds an additional .jar dependency and doesn't seem to be used. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: [VOTE] Release Lucene/Solr 3.2.0
Michael, A while ago I submitted SOLR-2462 with a patch to fix a critical bug in Solr's spellchecker. I'm not sure the included patch is the best approach to fixing the problem, but I do think any next release should include a fix for this. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Friday, May 27, 2011 10:50 AM To: dev@lucene.apache.org Dev Subject: [VOTE] Release Lucene/Solr 3.2.0 Please vote to release the artifacts at: http://people.apache.org/~mikemccand/lucene_solr_320/rc1/ as Lucene 3.2.0 and Solr 3.2.0. Mike http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Need some DIH Entity Processor development advice...
We have a situation where we have data coming from several long-running queries hitting multiple relational databases. Other data comes in fixed-width text file feeds, etc. All of this has to be joined and denormalized and made into nice Solr documents. I've been wanting to use DIH, as it seems to already provide 90% of what we need. The rest can come in the form of custom Transformers and Entity Processors that I can write. One big need is to have disk-backed caches. For instance, a child entity that pulls back millions of rows will beat up the db using a regular SqlEntityProcessor, whereas the CachedSqlEntityProcessor puts everything in memory in a HashMap, so it will only scale to a point. For fixed-width text files, there don't seem to be any cached implementations at all. So I've written a custom Entity Processor that creates a temporary Lucene index to use as a disk cache. Initial tests are promising, but with one little problem: I need a place to close the Lucene index reader and then delete the temporary index. It seemed easy enough to override the destroy() method from EntityProcessorBase. But to my surprise, it seems that both destroy() and init() get called every time a new primary key is called up from the cache (see DocBuilder.buildDocument()). Just to be sure I wasn't crazy, I added a destroy() method to CachedSqlEntityProcessor and found it indeed gets called every time a new primary key is called from the cache. In fact, the first couple of lines in cacheInit() in EntityProcessorBase seem to be there to cope with the fact that both destroy() and init() get called over and over again during the lifecycle of the object. I've also noticed that destroy() isn't actually implemented anywhere in the prepackaged Entity Processors. This makes me wonder if it is a mistake. Should DocBuilder be changed to call destroy() only once per lifecycle for each EntityProcessor object? If so, I think I can have a patch in JIRA in short order. 
Otherwise, how do I best accomplish my clean-up tasks? Advice is greatly appreciated. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
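One workaround for the lifecycle problem described above is to make the cleanup idempotent. The class below is an illustrative sketch only, not the real DIH API; the field and method names are invented. It shows the guard that keeps repeated destroy() calls (one per cached primary key, as observed in DocBuilder.buildDocument()) from tearing down the temporary Lucene index more than once:

```java
public class DiskBackedCacheProcessor {
    private boolean cacheOpen = false;
    private boolean cleanedUp = false;

    // May be called repeatedly; only opens the temporary index once.
    public void init() {
        if (!cacheOpen) {
            // open the temporary Lucene index here (omitted)
            cacheOpen = true;
        }
    }

    // Safe even if the caller invokes this once per primary key:
    // the real cleanup (close the IndexReader, delete the temporary
    // index directory) runs at most once.
    public void destroy() {
        if (cleanedUp) {
            return;
        }
        cleanedUp = true;
        cacheOpen = false;
        // close IndexReader and delete the temporary index directory here
    }

    boolean isCleanedUp() {
        return cleanedUp;
    }
}
```

The downside, of course, is that the guard only prevents double cleanup; it cannot know when the *last* destroy() has arrived, which is why a single end-of-lifecycle callback in DocBuilder would be the cleaner fix.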
RE: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality
Grant, I saw your comment and I agree it's probably best to somehow re-query through a Search Handler, either the existing one with all other components turned off, or through a new one just for this purpose. If you (or someone else) are not able to work on implementing it this way, then I can probably get a little time in a few weeks. James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: Grant Ingersoll [mailto:gsing...@apache.org] Sent: Friday, August 13, 2010 7:34 AM To: dev@lucene.apache.org Subject: Re: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality Hi James, Did you see my comments on the issue? On Aug 11, 2010, at 12:28 AM, Dyer, James wrote: Tom, I'm going to also need this to work with 1.4.1 within the next month or two so if someone else doesn't back-port it to 1.4.1 then I probably will. I also would like to see this working with shards. The PossibilityIterator class likely can be made a lot simpler. If nobody else takes care of these items I will try to find time to do so myself prior to making it work with 1.4.1. James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: Tom Phethean (JIRA) [mailto:j...@apache.org] Sent: Tuesday, August 10, 2010 10:01 AM To: dev@lucene.apache.org Subject: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality [ https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896903#action_12896903 ] Tom Phethean commented on SOLR-2010: Ok, thanks. Do you know if there is a rough timescale on that? 
Improvements to SpellCheckComponent Collate functionality - Key: SOLR-2010 URL: https://issues.apache.org/jira/browse/SOLR-2010 Project: Solr Issue Type: New Feature Components: clients - java, spellchecker Affects Versions: 1.4.1 Environment: Tested against trunk revision 966633 Reporter: James Dyer Assignee: Grant Ingersoll Priority: Minor Attachments: SOLR-2010.patch, SOLR-2010.patch Improvements to SpellCheckComponent Collate functionality Our project requires a better Spell Check Collator. I'm contributing this as a patch to get suggestions for improvements and in case there is a broader need for these features. 1. Only return collations that are guaranteed to result in hits if re-queried (applying original fq params also). This is especially helpful when there is more than one correction per query. The 1.4 behavior does not verify that a particular combination will actually return hits. 2. Provide the option to get multiple collation suggestions 3. Provide extended collation results including the # of hits re-querying will return and a breakdown of each misspelled word and its correction. This patch is similar to what is described in SOLR-507 item #1. Also, this patch provides a viable workaround for the problem discussed in SOLR-1074. A dictionary could be created that combines the terms from the multiple fields. The collator then would prune out any spurious suggestions this would cause. This patch adds the following spellcheck parameters: 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try before giving up. Lower values ensure better performance. Higher values may be necessary to find a collation that can return results. Default is 0, which maintains backwards-compatible behavior (do not check collations). 2. spellcheck.maxCollations - maximum # of collations to return. Default is 1, which maintains backwards-compatible behavior. 3. 
spellcheck.collateExtendedResult - if true, returns an expanded response format detailing collations found. Default is false, which maintains backwards-compatible behavior. When true, output is like this (in context):

<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="hopq">
      <int name="numFound">94</int>
      <int name="startOffset">7</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>hope</str>
        <str>how</str>
        <str>hope</str>
        <str>chops</str>
        <str>hoped</str>
        etc
      </arr>
    </lst>
    <lst name="faill">
      <int name="numFound">100</int>
      <int name="startOffset">16</int>
      <int name="endOffset">21</int>
      <arr name="suggestion">
        <str>fall</str>
        <str>fails</str>
        <str>fail</str>
RE: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality
Tom, I'm going to also need this to work with 1.4.1 within the next month or two so if someone else doesn't back-port it to 1.4.1 then I probably will. I also would like to see this working with shards. The PossibilityIterator class likely can be made a lot simpler. If nobody else takes care of these items I will try to find time to do so myself prior to making it work with 1.4.1. James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: Tom Phethean (JIRA) [mailto:j...@apache.org] Sent: Tuesday, August 10, 2010 10:01 AM To: dev@lucene.apache.org Subject: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality [ https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896903#action_12896903 ] Tom Phethean commented on SOLR-2010: Ok, thanks. Do you know if there is a rough timescale on that? Improvements to SpellCheckComponent Collate functionality - Key: SOLR-2010 URL: https://issues.apache.org/jira/browse/SOLR-2010 Project: Solr Issue Type: New Feature Components: clients - java, spellchecker Affects Versions: 1.4.1 Environment: Tested against trunk revision 966633 Reporter: James Dyer Assignee: Grant Ingersoll Priority: Minor Attachments: SOLR-2010.patch, SOLR-2010.patch Improvements to SpellCheckComponent Collate functionality Our project requires a better Spell Check Collator. I'm contributing this as a patch to get suggestions for improvements and in case there is a broader need for these features. 1. Only return collations that are guaranteed to result in hits if re-queried (applying original fq params also). This is especially helpful when there is more than one correction per query. The 1.4 behavior does not verify that a particular combination will actually return hits. 2. Provide the option to get multiple collation suggestions 3. 
Provide extended collation results including the # of hits re-querying will return and a breakdown of each misspelled word and its correction. This patch is similar to what is described in SOLR-507 item #1. Also, this patch provides a viable workaround for the problem discussed in SOLR-1074. A dictionary could be created that combines the terms from the multiple fields. The collator then would prune out any spurious suggestions this would cause. This patch adds the following spellcheck parameters: 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try before giving up. Lower values ensure better performance. Higher values may be necessary to find a collation that can return results. Default is 0, which maintains backwards-compatible behavior (do not check collations). 2. spellcheck.maxCollations - maximum # of collations to return. Default is 1, which maintains backwards-compatible behavior. 3. spellcheck.collateExtendedResult - if true, returns an expanded response format detailing collations found. Default is false, which maintains backwards-compatible behavior. When true, output is like this (in context):

<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="hopq">
      <int name="numFound">94</int>
      <int name="startOffset">7</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>hope</str>
        <str>how</str>
        <str>hope</str>
        <str>chops</str>
        <str>hoped</str>
        etc
      </arr>
    </lst>
    <lst name="faill">
      <int name="numFound">100</int>
      <int name="startOffset">16</int>
      <int name="endOffset">21</int>
      <arr name="suggestion">
        <str>fall</str>
        <str>fails</str>
        <str>fail</str>
        <str>fill</str>
        <str>faith</str>
        <str>all</str>
        etc
      </arr>
    </lst>
    <lst name="collation">
      <str name="collationQuery">Title:(how AND fails)</str>
      <int name="hits">2</int>
      <lst name="misspellingsAndCorrections">
        <str name="hopq">how</str>
        <str name="faill">fails</str>
      </lst>
    </lst>
    <lst name="collation">
      <str name="collationQuery">Title:(hope AND faith)</str>
      <int name="hits">2</int>
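The three parameters from the patch description can be assembled into a request as in the sketch below. This is a hedged illustration: only the spellcheck.* parameter names come from the patch description above, while the query text and the chosen values are made up for the example.

```java
public class SpellcheckParams {
    // Assemble the SOLR-2010 spellcheck parameters into a query string.
    // The q value is hypothetical; parameter names are from the patch text.
    static String build() {
        return String.join("&",
            "q=Title:(hopq AND faill)",
            "spellcheck=true",
            "spellcheck.collate=true",
            "spellcheck.maxCollationTries=10",       // try up to 10 candidate collations
            "spellcheck.maxCollations=5",            // return up to 5 collations
            "spellcheck.collateExtendedResult=true"); // expanded response format
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```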