Streaming Expressions and R
As Solr's Streaming Expression support gains more statistical and data-analysis functionality, I thought it would be useful to have first-class support in R. To this end, I've begun an R package for this here: https://github.com/jdyer1/R-solr-stream At this point, this new package allows users to execute a streaming expression and read the result into an R data.frame. Likewise, an R object can be streamed to Solr. This has obvious overlap with the existing "rsolr" package. However, the existing package does not, best I can tell, support streaming expressions. Also, we already have support via the JDBC driver. My question is whether or not an effort along these lines is worthwhile, and if so, what future direction it should take. I appreciate any feedback. James Dyer Ingram Content Group
RE: [JENKINS-EA] Lucene-Solr-6.x-Linux (32bit/jdk-9-ea+168) - Build # 3502 - Still Failing!
I committed the updated hsqldb jar today, along with its sha1. Pre-commit passes for me. Does anyone know why Jenkins is complaining and if there is anything I must do to fix this? James Dyer Ingram Content Group -Original Message- From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de] Sent: Friday, May 12, 2017 1:33 PM To: dev@lucene.apache.org Subject: [JENKINS-EA] Lucene-Solr-6.x-Linux (32bit/jdk-9-ea+168) - Build # 3502 - Still Failing! Importance: Low Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/3502/ Java: 32bit/jdk-9-ea+168 -client -XX:+UseG1GC All tests passed Build Log: [...truncated 49636 lines...] BUILD FAILED /home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:775: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:655: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:643: Source checkout is modified!!! Offending files: * solr/licenses/hsqldb-2.4.0.jar.sha1 Total time: 63 minutes 5 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts [WARNINGS] Skipping publisher since build result is FAILURE Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: JDBCStream and loading drivers
Thank you for the quick replies. I can see how it would be powerful to be able to execute streaming expressions outside of Solr, giving yourself the option of moving some of the work to the client. I wouldn't necessarily tie it into core, because being able to join a Solr stream with an RDBMS result -- either within Solr, or in your driver program -- could be a nice set of options to have. But the patch on SOLR-1015 seems (from a quick look) to get this right, in that it uses the core's classloader when it is available, and falls back when it is not. It might be nice -- especially as the streaming code base grows -- to consider packaging it separately from the SolrJ client itself. Along these lines: I was initially confused by the examples in https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions in that the cURL example at the top is materially different from the SolrJ example following it. That is, with the cURL example, all of the work occurs in Solr and only the final result is streamed back. With the SolrJ example, some of that work is done in the client. This is easy to discover if you try the JDBC expression: following the cURL example, the query originates in Solr; with the SolrJ example, the query originates on the client -- the server has no involvement at all. Is my understanding here correct? I can see how this design has a great advantage, as it gives us the ability to write driver programs that use the Solr cores as worker nodes. But this wasn't immediately clear to me. I also wonder: do we have an (easy) way with SolrJ currently to simply execute a (chain of) streaming expression(s) and get the result back, like in the cURL example (besides using JDBC)? 
James Dyer Ingram Content Group From: Joel Bernstein [mailto:joels...@gmail.com] Sent: Tuesday, April 25, 2017 6:25 PM To: lucene dev <dev@lucene.apache.org> Subject: Re: JDBCStream and loading drivers There are a few stream impl's that have access to SolrCore (ClassifyStream, AnalyzeEvaluator) because they use analyzers. These classes have been added to core. We could move the JdbcStream to core as well if it makes the user experience nicer. Originally the idea was that you could run the Streaming API Java classes like you would other Solrj clients. I think over time this may become important again, as I believe there is work underway for spinning up worker nodes that are not attached to a SolrCore. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Apr 25, 2017 at 3:25 PM, Dyer, James <james.d...@ingramcontent.com> wrote: Using JDBCStream, Solr cannot find my database driver if I put the .jar in the shared lib directory ($SOLR_HOME/lib). In order for the classloader to find it, the driver has to be in the server's lib directory. Looking at why, I see that to get the full classpath, including what is in the shared lib directory, we'd typically get a reference to a SolrCore, call "getResourceLoader" and then "findClass". This makes use of the URLClassLoader that knows about the shared lib. But fixing JDBCStream to do this might not be so easy? Best I can tell, Streaming Expressions are written nearly stand-alone as client code that merely executes in the Solr JVM. Is this correct? Indeed, the code itself is included with the client, in the SolrJ package, despite it mostly being server-side code … Maybe I misunderstand? On the one hand, it isn't a huge deal as to where you need to put your drivers to make this work. But on the other hand, it isn't really the best user experience, in my opinion at least, to have to dig around the server directories to find where your driver needs to go. 
And also, if this is truly server-side code, why do we ship it with the client jar? Unless there is a desire to make a stand-alone Streaming Expression engine that interacts with Solr as a client, would it be acceptable to somehow expose the SolrCore to it for loading resources like this? James Dyer Ingram Content Group
JDBCStream and loading drivers
Using JDBCStream, Solr cannot find my database driver if I put the .jar in the shared lib directory ($SOLR_HOME/lib). In order for the classloader to find it, the driver has to be in the server's lib directory. Looking at why, I see that to get the full classpath, including what is in the shared lib directory, we'd typically get a reference to a SolrCore, call "getResourceLoader" and then "findClass". This makes use of the URLClassLoader that knows about the shared lib. But fixing JDBCStream to do this might not be so easy? Best I can tell, Streaming Expressions are written nearly stand-alone as client code that merely executes in the Solr JVM. Is this correct? Indeed, the code itself is included with the client, in the SolrJ package, despite it mostly being server-side code ... Maybe I misunderstand? On the one hand, it isn't a huge deal as to where you need to put your drivers to make this work. But on the other hand, it isn't really the best user experience, in my opinion at least, to have to dig around the server directories to find where your driver needs to go. And also, if this is truly server-side code, why do we ship it with the client jar? Unless there is a desire to make a stand-alone Streaming Expression engine that interacts with Solr as a client, would it be acceptable to somehow expose the SolrCore to it for loading resources like this? James Dyer Ingram Content Group
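The "use the core's resource loader when available, fall back otherwise" approach described above can be sketched generically. This is an illustrative sketch only, not Solr's actual code; the DriverLoader class and the libDir parameter are hypothetical names:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: prefer a classloader built over a shared lib
// directory of jars; fall back to the default loader when none exists.
public class DriverLoader {
    static Class<?> loadDriver(String className, Path libDir) throws Exception {
        if (libDir != null && Files.isDirectory(libDir)) {
            List<URL> urls = new ArrayList<>();
            try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
                for (Path jar : jars) {
                    urls.add(jar.toUri().toURL());
                }
            }
            ClassLoader cl = new URLClassLoader(urls.toArray(new URL[0]),
                    DriverLoader.class.getClassLoader());
            return Class.forName(className, true, cl);
        }
        // Fallback: no shared lib directory available, use the default loader.
        return Class.forName(className);
    }

    public static void main(String[] args) throws Exception {
        // With no lib dir configured, this resolves via the default loader.
        System.out.println(loadDriver("java.util.ArrayList", null).getName());
    }
}
```

A JDBC driver jar dropped into the lib directory would then be visible without touching the server's own classpath, which is the user-experience point made above.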
RE: Speculating about the removal of the standalone Solr mode
I would think it unfortunate if this ever happens. Solr in non-cloud mode is simple, easy to understand, and has few moving parts. Many installations do not need to shard, have real-time updates, etc. Using the replication handler in "legacy mode" works great for us. The config files are on the filesystem. You need not learn a CLI to interact with ZooKeeper, etc. I would be scared to death running cloud mode in production if I didn't first obtain an in-depth understanding of ZooKeeper internals. I can see it if there is a huge burden imposed here and if almost all use-cases require cloud. But as for "API consolidation", there are few APIs you need to learn if running non-cloud. So what stops us from focusing APIs on the needs of cloud installations? And the documentation for non-cloud ought to be simple to maintain; there's so much less to learn and know. For those of you who work as consultants or for support providers, it may seem that everyone is running cloud mode. But my guess is those who run cloud mode are the ones who cannot get by without your services. James Dyer Ingram Content Group -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Wednesday, March 09, 2016 11:34 AM To: dev@lucene.apache.org Subject: Speculating about the removal of the standalone Solr mode I've been thinking about the fact that standalone and cloud modes in Solr are very different. The writing on the wall suggests that Solr will eventually (probably 7.0 minimum) eliminate the standalone mode and always operate with zookeeper. A "standalone" node would in fact be a single-node cloud running the embedded zookeeper. Once zk-as-truth becomes a reality, I can see a few advantages to always running in cloud mode. The documentation can include one way to accomplish basic tasks. The CoreAdmin API can be eliminated, and any required functionality fully merged into the Collections API. CloudSolrClient will work for all installations. 
A script that works for cloud mode will also work for standalone mode, because that's just a smaller cloud. I was planning to open an issue to discuss and implement this. If that's not a good idea, please let me know. None of my main Solr installations are running in cloud mode, so the removal of standalone mode will be an inconvenience for me, but I still think it's the right thing to do in the long term. Thanks, Shawn
RE: Lucene/Solr git mirror will soon turn off
I know Infra has tried a number of things to resolve this, to no avail. But did we try "git-svn --revision=" to only mirror "post-LUCENE-3930" (ivy, r1307099)? Or if that's not lean enough for the git-svn mirror to work, then cut off when 4.x was branched or whenever. The hope would be to give git users enough of the past that it would be useful for new development but then also we can retain the status quo with svn (which is the best path for a 26-day timeframe). James Dyer Ingram Content Group -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Friday, December 04, 2015 2:58 PM To: Lucene/Solr dev Cc: infrastruct...@apache.org Subject: Lucene/Solr git mirror will soon turn off Hello devs, The infra team has notified us (Lucene/Solr) that in 26 days our git-svn mirror will be turned off, because running it consumes too many system resources, affecting other projects, apparently because of a memory leak in git-svn. Does anyone know of a link to this git-svn issue? Is it a known issue? If there's something simple we can do (remove old jars from our svn history, remove old branches), maybe we can sidestep the issue and infra will allow it to keep running? Or maybe someone in the Lucene/Solr dev community with prior experience with git-svn could volunteer to play with it to see if there's a viable solution, maybe with command-line options e.g. to only mirror specific branches (trunk, 5.x)? Or maybe it's time for us to switch to git, but there are problems there too, e.g. we are currently missing large parts of our svn history from the mirror now and it's not clear whether that would be fixed if we switched: https://issues.apache.org/jira/browse/INFRA-10828 Also, because we used to add JAR files to svn, the "git clone" would likely take several GBs unless we remove those JARs from our history. 
Or if anyone has any other ideas, we should explore them, because otherwise in 26 days there will be no more updates to the git mirror of Lucene and Solr sources... Thanks, Mike McCandless http://blog.mikemccandless.com
RE: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_66) - Build # 5439 - Failure!
I'm looking at this failure. I cannot reproduce this on Linux using: ant test -Dtests.class="*.SpellCheckComponentTest" -Dtests.seed=110D525A21D16B1:8944EAFF0CE17B49 Tomorrow I will try this on Windows. James Dyer Ingram Content Group -Original Message- From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de] Sent: Wednesday, December 02, 2015 12:53 PM To: ans...@apache.org; mikemcc...@apache.org; sha...@apache.org; romseyg...@apache.org; dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_66) - Build # 5439 - Failure! Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/5439/ Java: 64bit/jdk1.8.0_66 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.handler.component.SpellCheckComponentTest.test Error Message: List size mismatch @ spellcheck/suggestions Stack Trace: java.lang.RuntimeException: List size mismatch @ spellcheck/suggestions at __randomizedtesting.SeedInfo.seed([110D525A21D16B1:8944EAFF0CE17B49]:0) at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:837) at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:784) at org.apache.solr.handler.component.SpellCheckComponentTest.test(SpellCheckComponentTest.java:96) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
RE: Solr Spell checker for non-english language
Safat, DirectSolrSpellChecker defaults to Levenshtein Distance to determine how closely the query terms match the actual terms in the index (see https://en.wikipedia.org/wiki/Levenshtein_distance). This is not an English-specific metric and it works for many languages. Assuming this is not appropriate for the Bangla language (sorry for my ignorance!), you might need to implement your own distance metric by implementing the StringDistance interface. You can specify your custom class using the distanceMeasure parameter under the SpellCheckComponent entry in solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">fully.qualified.classname.here</str>
    .. etc ..
  </lst>
</searchComponent>

For more information, see: http://lucene.apache.org/core/5_2_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html#setDistance%28org.apache.lucene.search.spell.StringDistance%29 Finally, if misplaced whitespace in the query is a problem in Bangla, you may wish to consider using WordBreakSolrSpellChecker in conjunction with DirectSolrSpellChecker to correct these problems as well. See the main Solr example solrconfig.xml for more information. (https://github.com/apache/lucene-solr/blob/branch_5x/solr/example/files/conf/solrconfig.xml) James Dyer Ingram Content Group From: Safat Siddiqui [mailto:safat...@gmail.com] Sent: Monday, July 06, 2015 10:06 PM To: dev@lucene.apache.org Subject: Solr Spell checker for non-english language Hello, I am using Solr version 4.10.3 and trying to customize it for the Bangla language. I have already built a Bangla language stemmer for Solr indexing: it works fine. Now I would like to use the Solr spell checker and suggestion functionality for Bangla. Which section in DirectSolrSpellChecker should I modify? I cannot find which section causes the difference between English and non-English languages. 
A direction will be very helpful for me. Thanks in advance. Regards, Safat -- Thanks, Safat Siddiqui Student Department of CSE Shahjalal University of Science and Technology Sylhet, Bangladesh.
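For reference, the Levenshtein distance mentioned in the reply above can be computed with the classic dynamic-programming algorithm. This is a minimal sketch of the metric itself, not Lucene's implementation (DirectSpellChecker uses its own optimized variant); the class and method names are hypothetical:

```java
// Classic two-row dynamic-programming Levenshtein edit distance.
// Operates per char, so it is language-agnostic for BMP characters,
// which is the point made above about the metric not being English-specific.
public class Levenshtein {
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1,  // insertion
                                            prev[j] + 1),      // deletion
                                   prev[j - 1] + cost);        // substitution
            }
            int[] t = prev; prev = curr; curr = t;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3
    }
}
```

A custom StringDistance for a particular script would typically wrap logic like this while adding script-specific substitution costs.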
RE: [VOTE] Move trunk to Java 8
+1 to stay on 1.7, from another small-scale committer. I do not see any compelling reason to upgrade to 1.8, except, as Benson says, for minor programming conveniences. My company would be one of the ones you'd be shutting out if we were on 1.8 now. (Some of our apps upgraded to 1.7 this year.) Of course there is 4.x, but what is 1.8 going to buy 5.x that makes it worth significantly shrinking the potential user base? James Dyer Ingram Content Group (615) 213-4311 From: Benson Margulies [mailto:bimargul...@gmail.com] Sent: Friday, September 12, 2014 3:45 PM To: dev@lucene.apache.org Subject: Re: [VOTE] Move trunk to Java 8 Corporate overlords isn't helpful. Lucene is what it is because of its wide adoption. That includes big, small, smart, and stupid organizations. I don't think that an infrastructure component like Lucene needs to be 'ahead of the curve'. It should aim to be widely adoptable. To me, that means moving to a new Java requirement after we observe it is semi-ubiquitous. If 1.8 offered some game-changing JVM feature that would allow a giant leap forward in Lucene, then that would be different. So far, all I see are some minor programming conveniences. However, I'm just one very small scale committer, and I've consumed enough oxygen on this topic.
RE: Adding Morphline support to DIH - worth the effort?
Alexandre, I think that writing a new entity processor for DIH is a much less risky thing to commit than, say, SOLR-4799. Entity Processors work as plug-ins and they aren't likely to break anything else. So a Morphline EntityProcessor is much more likely to be evaluated and committed. But like anything else, you're going to need to explain what the need is and what this new e.p. buys the user community. There needs to be unit tests, etc. Besides this, if you can show how a morphline e.p. can be a step towards migrating away from DIH entirely, then that would be a plus. Perhaps create a new solr example along the lines of the dih solr example that demonstrates to users this new way forward. This would go a long way in convincing the community we have a viable alternative to dih. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Tuesday, June 10, 2014 9:55 PM To: dev@lucene.apache.org Subject: Re: Adding Morphline support to DIH - worth the effort? Ripples in the pond again. Spreading and dying. Understandable, but still somewhat annoying. So, what would be the minimal viable next step to move this conversation forward? Something for 4.11 as opposed to 5.0? Anyone with commit status has a feeling of what - minimal - deliverable they would put their own weight behind? Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Mon, Jun 9, 2014 at 10:50 AM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote: One of the ideas over DIH discussed earlier is making it standalone. Yeah; my beef with the DIH is that it’s tied to Solr. But I’d rather see something other than the DIH outside Solr; it’s not worthy IMO. Why have something Solr specific even? A great pipeline shouldn’t tie itself to any end-point. There are a variety of solutions out there that I tried. 
There are the big 3 open-source ETLs (Kettle, Clover, Talend), and they aren't quite ideal in one way or another. And Spring Integration. And some half-baked data pipelines like OpenPipe and OpenPipeline. I never got around to taking a good look at Findwise's open-sourced Hydra, but I learned enough to know, to my surprise, that it was configured in code versus a config file (like all the others), and that's a big turn-off to me. Today I read through most of the Morphlines docs and a few choice source files and I'm super-impressed. But as you note it's missing a lot of other stuff. I think something great could be built using it as a core piece. ~ David Smiley Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley On Sun, Jun 8, 2014 at 5:51 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Jack, I found your considerations quite reasonable. One of the ideas over DIH discussed earlier is making it standalone. So, if we start from a simple Morphline UI, we can do this extraction. Then, such an externalized ETL will work better with SolrCloud than DIH works now. Presumably we can reuse DIH JDBC DataSources as a source for Morphline records. Still open questions in this approach are: - joins/caching - seem possible with Morphlines but still there is no such command - delta import - a scenario we must not forget to handle - threads (it's completely outside Morphline's concerns) - distributed processing - it would be great if we can partition the datasource, e.g. something like what's done by Sqoop ... what else? On Sun, Jun 8, 2014 at 6:54 PM, Jack Krupansky j...@basetechnology.com wrote: I've avoided DIH like the plague since it really doesn't fit well in Solr, so I'm still baffled as to why you think we need to use DIH as the foundation for a Solr Morphlines project. 
That shouldn't stop you, but what's the big impediment to taking a clean slate approach to Morphlines - learn what we can from DIH, but do a fresh, clean Solr 5.0 implementation that is not burdened from the get-go with all of DIH's baggage? Configuring DIH is one of its main problems, so blending Morphlines config into DIH config would seem to just make Morphlines less attractive than it actually is when viewed by itself. You might also consider how ManifoldCF (another Apache project) would integrate with DIH and Morphlines as well. I mean, the core use case is ETL from external data sources. And how all of this relates to Apache Flume as well. But back to the original, still unanswered, question: Why use DIH as the starting point for integrating Morphlines with Solr - unless the goal is to make Morphlines unpalatable and less approachable than even DIH itself?! Another question: What does Elasticsearch have in this area (besides rivers)? Are they headed in the Morphlines direction as well? -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Sunday, June 8,
RE: [Apache Solr] Filter query Suggester and Spellchecker
Alessandro, The spellcheck.collate feature already supports this by specifying spellcheck.maxCollationTries greater than zero. This is useful both to prevent unauthorized access to data and also to guarantee that suggested collations will return some results. But maxCollationTries accomplishes this by running the proposed collation queries against the index. If you are interested in preventing unauthorized access only, then you can probably get better performance with a lower-level filter at the term level. There is currently no way to filter the single-term suggestions. I could see this as a nice enhancement, but given the current maxCollationTries support, it may have a pretty narrow use-case. I've also thought about moving all the collate functionality to the Lucene level, so that clients other than Solr can take advantage of it. Perhaps something along the lines of your proposal could be a work in that direction? James Dyer Ingram Content Group (615) 213-4311 From: Alessandro Benedetti [mailto:benedetti.ale...@gmail.com] Sent: Wednesday, January 15, 2014 11:53 AM To: dev@lucene.apache.org Subject: Re: [Apache Solr] Filter query Suggester and Spellchecker No one? guys? 2014/1/14 Alessandro Benedetti benedetti.ale...@gmail.com Hi guys, this proposal is for an improvement. I propose to add the ability to suggest terms (for spellchecking and auto-suggest) based only on a subset of documents. In this way we can provide security implementations that will allow users to see suggestions of terms only from documents they are allowed to see. These are the proposed approaches: Filter query Auto Suggest 1) retrieve the suggested tokens from the input text using the already cutting-edge FST-based suggester 2) use a similar approach to the TermEnum because a) we have a small set of suggestions (reasonable, because we can filter to 5-10 suggestions max), so the TermEnum approach will be fast. 
b) we can get for each suggested token the posting list and intersect it with the doc-ID list resulting from the filter query; if the intersection is empty, do not return the suggestion. Filter query Spellcheck 1) we can use the already cutting-edge FSA-based direct index spellchecker and get the suggestions 2) use a similar approach to the TermEnum because a) we have a small set of suggestions (reasonable, because we can filter to 5-10 suggestions max), so the TermEnum approach will be fast. b) we can get for each suggested token the posting list and intersect it with the doc-ID list resulting from the filter query; if the intersection is empty, do not return the suggestion. Of course we will have to add a further parameter in the request handler, something like: spellcheck.qf Let me know your impressions and ideas, Cheers -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
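The intersection step proposed in (b) above can be sketched roughly as follows. This is an illustrative sketch only; the FilteredSuggest class is hypothetical, and the in-memory postings map stands in for a real TermsEnum/postings lookup:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the proposed filter-aware suggester step: keep a suggested
// term only if its posting list intersects the filter query's doc-ID set.
public class FilteredSuggest {
    static List<String> filterSuggestions(List<String> suggestions,
                                          Map<String, Set<Integer>> postings,
                                          Set<Integer> allowedDocs) {
        List<String> kept = new ArrayList<>();
        for (String term : suggestions) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            // Intersection test: does any allowed document contain the term?
            for (int doc : docs) {
                if (allowedDocs.contains(doc)) {
                    kept.add(term);
                    break;
                }
            }
        }
        return kept;
    }
}
```

With suggestions capped at 5-10 terms, this per-term intersection stays cheap, which is the performance argument made in the proposal.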
RE: have developer question about ClobTransformer and DIH
I think this code snippet attempts to map the schema.xml types to database types. If your database is indeed sending this as a LONGVARCHAR, I would expect a default resultSet.getString(index) to correctly get text from a LONGVARCHAR column. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Tuesday, May 21, 2013 9:09 AM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH Since i don't see Types.LONGVARCHAR mentioned anywhere in the DIH code base, i suspect it's falling back to some default behavior assuming String data, which doesn't account for the way LONGVARCHAR data is probably returned as an Object that needs to be streamed similar to a Clob. could this be the default behaviour?

for (Map<String, String> map : context.getAllEntityFields()) {
  String n = map.get(DataImporter.COLUMN);
  String t = map.get(DataImporter.TYPE);
  if ("sint".equals(t) || "integer".equals(t)) fieldNameVsType.put(n, Types.INTEGER);
  else if ("slong".equals(t) || "long".equals(t)) fieldNameVsType.put(n, Types.BIGINT);
  else if ("float".equals(t) || "sfloat".equals(t)) fieldNameVsType.put(n, Types.FLOAT);
  else if ("double".equals(t) || "sdouble".equals(t)) fieldNameVsType.put(n, Types.DOUBLE);
  else if ("date".equals(t)) fieldNameVsType.put(n, Types.DATE);
  else if ("boolean".equals(t)) fieldNameVsType.put(n, Types.BOOLEAN);
  else if ("binary".equals(t)) fieldNameVsType.put(n, Types.BLOB);
  else fieldNameVsType.put(n, Types.VARCHAR);
}

-- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064916.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
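The type mapping being discussed above can be restated as a compilable sketch; the TypeMap class and sqlType method are hypothetical names, not DIH's actual code:

```java
import java.sql.Types;

// Sketch of the DIH mapping discussed above: schema.xml type names map
// to java.sql.Types constants, with VARCHAR as the default -- which is
// why unrecognized types (including LONGVARCHAR columns) end up being
// read via resultSet.getString().
public class TypeMap {
    static int sqlType(String t) {
        if ("sint".equals(t) || "integer".equals(t)) return Types.INTEGER;
        if ("slong".equals(t) || "long".equals(t)) return Types.BIGINT;
        if ("float".equals(t) || "sfloat".equals(t)) return Types.FLOAT;
        if ("double".equals(t) || "sdouble".equals(t)) return Types.DOUBLE;
        if ("date".equals(t)) return Types.DATE;
        if ("boolean".equals(t)) return Types.BOOLEAN;
        if ("binary".equals(t)) return Types.BLOB;
        return Types.VARCHAR; // default: anything else is treated as String data
    }
}
```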
RE: have developer question about ClobTransformer and DIH
Yes, that is correct. So it is going to do resultSet.getString(zzz) for any type it cannot address with the case statement. This should be fine if your db is returning a LONGVARCHAR. I see in the code also that if you specify <dataSource convertType="false" ... /> it will do resultSet.getObject(zzz) on everything. I doubt it, but this might address your problem in the case of a jdbc driver doing something out of the ordinary. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Tuesday, May 21, 2013 9:58 AM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH james, just trying to learn more about the source code, looking at JdbcDataSource.java, it looks like this is the default behavior of the case statement in method getARow():

default:
  result.put(colName, resultSet.getString(colName));
  break;

-- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064934.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: have developer question about ClobTransformer and DIH
Chris, I'm basing my opinion here on this statement, "The method ResultSet.getString, which allocates and returns a new String object, is recommended for retrieving data from CHAR, VARCHAR, and LONGVARCHAR fields," from section 9.3.1 of this document: http://docs.oracle.com/javase/1.4.2/docs/guide/jdbc/getstart/mapping.html I realize this is from 1.4.2 but I could not find a newer version of this document. I would not expect (blindly this time) it to have changed in a backwards-incompatible way. In the end, of course, it really depends on a particular jdbc driver's implementation. If an obscure database's jdbc driver--after the user did a funny workaround to get some other tool to work--is returning bytes or hex addresses or whatever, and you can solve it with a cast...why would we want to modify our code to make this particular case work more smoothly? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Tuesday, May 21, 2013 10:28 AM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH : If your database is indeed sending this as a LONGVARCHAR, I would expect : a default resultset.getString(index) to correctly get text from a : LONGVARCHAR column. James: how certain is your expectation? Based on the sparse mentions of LONGVARCHAR in the ResultSet class docs, i'm not convinced getString() will do the right thing http://docs.oracle.com/javase/6/docs/api/java/sql/ResultSet.html#getAsciiStream%28int%29 -Hoss
RE: have developer question about ClobTransformer and DIH
I think you're confusing the hierarchy of your database's types with the hierarchy in Java. In Java, a java.sql.Blob and a java.sql.Clob are 2 different things. They do not extend a common ancestor (except java.lang.Object). To write code that deals with both means you need to have separate paths for each object type. There is no way around this. (Compare the situation with Integer, Float, BigDecimal, etc., which all extend Number... In that case, your jdbc code can just expect a Number back from the database regardless of what object a particular jdbc driver decided to return to you.) James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Friday, May 17, 2013 9:01 PM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH i still have a disconnect on this (see below). i have been reading on the informix site about BLOB, CLOB and TEXT types. *i mis-stated earlier that a TEXT type is another type of informix blob - after reading the docs, this is not true.* "I think what it comes down to is that a Clob is-not-a Blob." the informix docs indicate the opposite: CLOB and BLOB are sub-classes of smart object types. 
what is a smart object type (the super class for BLOB and CLOB): http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm what is a BLOB type: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm what is a CLOB type: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm what is a TEXT type: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlr.doc/sqlrmst136.htm after reading the above - my disconnect lies with the following: if an informix TEXT type is basically text - then why did solr return the two TEXT fields as binary addresses, when i removed all references to ClobTransformer and the clob=true switches from the fields in the db-config.xml file?? if TEXT is just text, then there should be no need to leverage ClobTransformer and to cast TEXT type fields as CLOBs. see my earlier post on the solr users group for the detail: http://lucene.472066.n3.nabble.com/having-trouble-storing-large-text-blob-fields-returns-binary-address-in-search-results-td4063979.html#a4064260 mark -- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064323.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: svn commit: r1484015 - in /lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext: ./ jcl-over-slf4j-1.6.6.jar jul-to-slf4j-1.6.6.jar log4j-1.2.16.jar slf4j-api-1.6.6.jar slf4j-log4j12-1.6.6.jar
My apologies. I will revert now. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Monday, May 20, 2013 2:48 PM To: Lucene/Solr dev Subject: Re: svn commit: r1484015 - in /lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext: ./ jcl-over-slf4j-1.6.6.jar jul-to-slf4j-1.6.6.jar log4j-1.2.16.jar slf4j-api-1.6.6.jar slf4j-log4j12-1.6.6.jar James, I'm assuming this was a mistake? Can you revert it? Thanks. Mike McCandless http://blog.mikemccandless.com On Sat, May 18, 2013 at 4:41 AM, Uwe Schindler u...@thetaphi.de wrote: What happened here?: - We don't use 4.2 branch anymore - Please don't commit JAR files - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: jd...@apache.org [mailto:jd...@apache.org] Sent: Saturday, May 18, 2013 12:13 AM To: comm...@lucene.apache.org Subject: svn commit: r1484015 - in /lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext: ./ jcl-over-slf4j-1.6.6.jar jul-to-slf4j-1.6.6.jar log4j-1.2.16.jar slf4j-api-1.6.6.jar slf4j-log4j12-1.6.6.jar Author: jdyer Date: Fri May 17 22:13:05 2013 New Revision: 1484015 URL: http://svn.apache.org/r1484015 Log: initial buy Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/ lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/log4j-1.2.16.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-api-1.6.6.jar (with props) lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-log4j12-1.6.6.jar (with props) Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/log4j-1.2.16.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/log4j-1.2.16.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-api-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-api-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available. Added: lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-log4j12-1.6.6.jar URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_2/solr/example/lib/ext/slf4j-log4j12-1.6.6.jar?rev=1484015&view=auto == Binary file - no diff available.
RE: have developer question about ClobTransformer and DIH
I think the usual practice is to use BLOB types to store data that is not a character stream, so your case is probably pretty rare. If casting solves the issue, then why not? I think people use casts all the time to solve these types of compatibility issues. Then again, if ClobTransformer were changed to handle BLOBs also, I do not see the harm. But I would think it would be a much more common case that users would be putting binary-format documents in BLOBs and then feeding them to Tika or something to extract the text. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Friday, May 17, 2013 1:34 PM To: dev@lucene.apache.org Subject: have developer question about ClobTransformer and DIH hello, this is my first post to this forum - if this question is not correct for this forum (or has been addressed in another jira) - just let me know ;) environment: solr 3.5 informix 11.x centos Problem statement: ClobTransformer (./solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/ClobTransformer.java) stopped working when two columns in the table were converted from CLOB to Text. More Detail: recently i ran into an issue while attempting to use the DIH against an informix table. the DIH and ClobTransformer were working well with two (2) fields that were defined as CLOB. to resolve another informix-specific issue, the two fields were changed to Text fields (another type of informix blob). after the change, another full import was done and it was discovered that these two fields were being returned with the classic hex address that denotes a binary field in the schema. after quite a bit of experimentation and discussion with the DBAs, i cast the two columns as clob. example: cast(att.attr_val AS clob) as attr_val, cast(rsr.rsr_val AS clob) as rsr_val, after doing this - the issue was resolved. Questions: 1) is this a known issue? 2) is this the prescribed remedy for this type of situation, using this version of solr (3.5)? 3) can i get more detail on why the ClobTransformer does not work with other blob-like fields? finally - i looked at the code for ClobTransformer (and Transformer) and was wondering if it is possible to change or add another class that would handle this use case out of the box. thx mark -- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: have developer question about ClobTransformer and DIH
I think what it comes down to is that a Clob is-not-a Blob. So any code dealing with Clobs that also wants to deal with Blobs and do the same thing with them is going to need to first check the object type returned from the jdbc driver, then do separate logic depending on the object type returned. Specifically, if it is a java.sql.Clob, it needs to call getCharacterStream, but if it is a java.sql.Blob, getBinaryStream. Possibly there are other gotchas about making assumptions about the binary stream? Then again, if a user uses ClobTransformer on a Blob, then perhaps you can assume all you want about what the binary stream is going to be? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Friday, May 17, 2013 4:44 PM To: dev@lucene.apache.org Subject: RE: have developer question about ClobTransformer and DIH Hello James, "I think the usual practice is to use BLOB types to store data that is not a character stream, so your case is probably pretty rare" admittedly - if the fields had been left as clob fields, then all would have been well. the change to informix Text blobs was driven by the need to use the informix dbload utility to push data into the target table, before using the DIH to pull data from the target table into the core. "If casting solves the issue, then why not?" ok - i will concede this point - but i am interested in why ClobTransformer _needs_ the cast to work in the first place. "Then again if ClobTransformer was changed to handle BLOBs also, I do not see the harm" if possible - i would like to understand more about ClobTransformer and what would be needed to make that change. "But I would think it would be a much more common case that users would be putting binary-format documents in BLOBs then feeding them to tika or something to extract the text." i am not sure - maybe. at SHC (Sears) the data being stored in these two columns is a large JSON blob. when a query is performed, the JSON blob is parsed and used as needed. thanks again for the discussion and education. mark -- View this message in context: http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-tp4064256p4064289.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
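The type-dispatch described above (getCharacterStream for a Clob, getBinaryStream for a Blob) can be sketched as a self-contained example. This is a hypothetical illustration, not the ClobTransformer source; it uses the JDK's SerialClob/SerialBlob so it runs without a database, and it assumes the Blob's bytes are UTF-8 text, as a user deliberately pointing a transformer at a text-bearing BLOB might:

```java
import java.io.InputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.sql.Blob;
import java.sql.Clob;
import javax.sql.rowset.serial.SerialBlob;
import javax.sql.rowset.serial.SerialClob;

// Clob and Blob share no common ancestor besides Object, so code that
// accepts both must branch on the runtime type and use a separate read path
// for each.
public class LobDispatchSketch {
    public static String readLob(Object lob) throws Exception {
        if (lob instanceof Clob) {
            Clob c = (Clob) lob;
            try (Reader r = c.getCharacterStream()) { // character path
                StringBuilder sb = new StringBuilder();
                int ch;
                while ((ch = r.read()) != -1) sb.append((char) ch);
                return sb.toString();
            }
        } else if (lob instanceof Blob) {
            Blob b = (Blob) lob;
            try (InputStream in = b.getBinaryStream()) { // byte path
                // Assumption: the binary stream carries UTF-8 text.
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
        return String.valueOf(lob); // fall back for plain values
    }

    public static void main(String[] args) throws Exception {
        Clob clob = new SerialClob("hello clob".toCharArray());
        Blob blob = new SerialBlob("hello blob".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLob(clob)); // hello clob
        System.out.println(readLob(blob)); // hello blob
    }
}
```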
RE: [Discussion] Discontinue the ROLLBACK command in Solr?
We use rollback (in conjunction with DIH) when doing full re-imports on a traditional non-cloud index. As DIH first deletes all documents then adds them all, it's handy for it to roll back its changes if something goes wrong. Then the indexing node can simply return to service executing queries until the problem is solved, etc. Would it be acceptable to retain rollback for non-cloud indexes that do not have atomic updates, etc. enabled? We could even put an "enable rollback" parameter in the config that is turned off by default, so users can be made to think about it before turning it on, etc. Of course, if rollback were removed, the workaround is to take a backup, then attempt the reindex, then restore the backup on failure. This is custom scripting that is currently done automatically. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, May 08, 2013 8:08 PM To: dev@lucene.apache.org Subject: Re: [Discussion] Discontinue the ROLLBACK command in Solr? : Many are confused about the rollback feature in Solr, since it cannot : guarantee a rollback of updates from that client since last commit. : : In my opinion it is pretty useless to have a rollback feature you cannot : rely upon - Unless, that is, you are the only client for sure, having no : autoCommit, and a huge RAMbuffer. : : So why don't we simply deprecate the feature in 4.x and remove it from 5.0? +1 ... i don't remember the details of how/why/where rollback works, but as i understand it, there are some serious caveats to its usage, as well as some bugs that may not have any viable/simple solutions (at least as far as i know of). example... https://issues.apache.org/jira/browse/SOLR-4733 -Hoss
RE: Mini-proposal: Standalone Solr DIH and SolrCell jars
Someday it would be nice to see DIH be able to run in its own JVM, for just the reason Jack mentions. There are quite a few neat things like this that could be done with DIH, but I've tried to work more on improving the tests, fixing bugs, and generally making the code more attractive to developers. I don't think DIH has a chance to really grow up until these types of things get done. I know nothing about Solr Cell except that a few people on the mailing list have been burned trying to run it in production only to learn that it doesn't scale. At least that's the general gist I've heard: for prototyping purposes only. Maybe if it is re-architected as a stand-alone app it would fare better? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Friday, March 22, 2013 9:07 PM To: dev@lucene.apache.org Subject: Re: Mini-proposal: Standalone Solr DIH and SolrCell jars On 3/22/2013 7:04 PM, Jack Krupansky wrote: I wanted to get some preliminary feedback before filing this proposal as a Jira(s): Package Solr Data Import Handler and Solr Cell as standalone jars with command line interfaces to run as separate processes to promote more efficient distributed processing, both by separating them from the Solr JVM and allowing multiple instances running in parallel on multiple machines. And to make it easier for mere mortals to customize the ingestion code without diving deep into core Solr. That's a really interesting idea. You mentioned having them be grown-up siblings of the SimplePostTool, which would imply that the jar would be directly executable. What would be the mechanism for configuring it and getting DIH status? An alternate idea, if it's feasible, would be that you could drop the jar and its dependencies into a lib directory and embed it into an index update application. Hopefully it is only tied to SolrJ, not deep Solr or Lucene internals. I haven't checked. 
Thanks, Shawn - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: 4.1 release notes: please review
Steve, This is pretty longwinded and maybe just the first sentence will suffice, with "see the wiki for more information." All of this is documented there, more or less. None of this will affect very many people. -- The DataImportHandler contrib module has some minor backwards-compatibility breaks in this release.
1. Both NumberFormatTransformer and DateFormatTransformer default to the root locale if none is specified. Prior versions used the JVM default locale. It is strongly advised that users always specify the locale when using these transformers. See https://issues.apache.org/jira/browse/SOLR-4095
2. Both FileDataSource and FieldReaderDataSource default to UTF-8 encoding if none is specified. Prior versions used the JVM default. See https://issues.apache.org/jira/browse/SOLR-4096 . Also, the behavior of DataSource and encoding may change again in a subsequent release. See https://issues.apache.org/jira/browse/SOLR-2347 .
3. The formatDate evaluator now defaults to using the root locale. Prior versions used the JVM default. Both the locale and timezone can now be specified using new optional parameters. See https://issues.apache.org/jira/browse/SOLR-4086 and https://issues.apache.org/jira/browse/SOLR-2201 .
4. The dataimport.properties file, which holds the last indexed timestamp for use with delta imports, now by default uses the root locale. This default can be overridden using the new <propertyWriter /> tag in data-config.xml. Prior versions used the default JVM locale. This is only of concern if your default locale uses different DateFormatSymbols than the root locale and if your installation depends on these alternate symbols (for instance, if your RDBMS takes dates using your locale-specific date symbols). See https://issues.apache.org/jira/browse/SOLR-4051
5. The experimental DIHProperties interface has changed, and is now an abstract class. This will require code changes for anyone who has a custom DIHProperties. Also note that future API changes with this class are possible in subsequent releases. See https://issues.apache.org/jira/browse/SOLR-4051
6. The Evaluator framework has received extensive refactoring. Some custom evaluators may require code changes. Specifically, public or protected methods from the EvaluatorBag class have been moved to the Evaluator abstract class that all Evaluators must extend. See https://issues.apache.org/jira/browse/SOLR-4086
-- James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Thursday, January 17, 2013 1:26 AM To: dev@lucene.apache.org Subject: 4.1 release notes: please review I took a crack at the Solr release note. I added CommonTermsQuery to the Lucene release note that Robert has been maintaining - looks good to me otherwise. Please help me whip these into shape. Solr: http://wiki.apache.org/solr/ReleaseNote41 Lucene: http://wiki.apache.org/lucene-java/ReleaseNote41 Thanks, Steve
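The locale changes in items 1, 3, and 4 above all stem from the same hazard: locale-sensitive date patterns render differently under different locales, so text written with the JVM default may not parse back under the root locale. A minimal sketch (hypothetical class name, not DIH code) illustrating the difference:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Shows why defaulting to Locale.ROOT matters: the "MMM" pattern element is
// locale-sensitive, so the same instant renders differently per locale.
public class LocaleDateSketch {
    public static String format(Date d, Locale locale) {
        SimpleDateFormat fmt = new SimpleDateFormat("dd-MMM-yyyy", locale);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // pin timezone for reproducibility
        return fmt.format(d);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(format(epoch, Locale.ROOT));   // 01-Jan-1970
        System.out.println(format(epoch, Locale.FRENCH)); // localized month abbreviation
    }
}
```

A properties file or timestamp written with the French month abbreviation would fail to parse under a root-locale SimpleDateFormat, which is exactly the delta-import concern item 4 describes.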
RE: 4.1 release notes: please review
Do you think it is appropriate that we put all of this in a section in the release notes, or something more succinct? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Thursday, January 17, 2013 10:36 AM To: dev@lucene.apache.org Subject: Re: 4.1 release notes: please review Hi James, Please go ahead edit the wiki page - I'm sure you'll do a better job of summarizing these than me. Steve On Jan 17, 2013, at 11:31 AM, Dyer, James james.d...@ingramcontent.com wrote: [...]
RE: 4.1 release notes: please review
Ok I have it in the wiki in its own section but I condensed it. Feel free to edit further as you desire. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Thursday, January 17, 2013 10:59 AM To: dev@lucene.apache.org Subject: Re: 4.1 release notes: please review My take on release notes is that they mainly talk about new features/big changes, and if other things are mentioned, they are only mentioned briefly. But if you think it's worth its own section, go for it. Steve On Jan 17, 2013, at 11:47 AM, Dyer, James james.d...@ingramcontent.com wrote: [...]
RE: Possible bug in Solr SpellCheckComponent if more than one QueryConverter class is present
Jack, Did you test this to see if you could trigger this bug? But in any case, can you open a jira ticket so this won't fall under the radar? Even if the comment that was put here is true, I guess we should minimally throw an exception, or use the first one and log a warning, maybe? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Sunday, January 13, 2013 1:24 PM To: Lucene/Solr Dev Subject: Possible bug in Solr SpellCheckComponent if more than one QueryConverter class is present Reading through the code for Solr SpellCheckComponent.java for 4.1, it looks like it neither complains nor defaults reasonably if more than one QueryConverter class is present in the Solr lib directories:

Map<String, QueryConverter> queryConverters = new HashMap<String, QueryConverter>();
core.initPlugins(queryConverters, QueryConverter.class);
//ensure that there is at least one query converter defined
if (queryConverters.size() == 0) {
  LOG.info("No queryConverter defined, using default converter");
  queryConverters.put("queryConverter", new SpellingQueryConverter());
}
//there should only be one
if (queryConverters.size() == 1) {
  queryConverter = queryConverters.values().iterator().next();
  IndexSchema schema = core.getSchema();
  String fieldTypeName = (String) initParams.get("queryAnalyzerFieldType");
  FieldType fieldType = schema.getFieldTypes().get(fieldTypeName);
  Analyzer analyzer = fieldType == null ? new WhitespaceAnalyzer(core.getSolrConfig().luceneMatchVersion) : fieldType.getQueryAnalyzer();
  //TODO: There's got to be a better way! Where's Spring when you need it?
  queryConverter.setAnalyzer(analyzer);
}

No else! And queryConverter is not initialized, except for that code path where there was zero or one QueryConverter class. -- Jack Krupansky
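The guard James suggests ("minimally throw an exception, or use the first one and log a warning") could look like the following. This is a hypothetical standalone sketch, not SpellCheckComponent code, using a generic plugin map so it runs without Solr:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a fail-fast plugin selector: fall back to a default when no
// plugin is registered, use the single plugin when exactly one is, and
// throw instead of silently leaving the reference uninitialized when
// more than one is found.
public class PluginGuardSketch {
    public static <T> T selectSingle(Map<String, T> plugins, T defaultPlugin) {
        if (plugins.isEmpty()) {
            return defaultPlugin; // mirrors the "No queryConverter defined" fallback
        }
        if (plugins.size() > 1) {
            throw new IllegalStateException(
                "Expected at most one plugin but found " + plugins.size()
                + ": " + plugins.keySet());
        }
        return plugins.values().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, String> one = new HashMap<>();
        one.put("queryConverter", "converterA");
        System.out.println(selectSingle(one, "default")); // converterA

        Map<String, String> none = new HashMap<>();
        System.out.println(selectSingle(none, "default")); // default
    }
}
```

Picking the first entry and logging a warning, the other option mentioned, would just replace the throw with a log statement and a `return plugins.values().iterator().next();`.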
RE: DIH - Using temporary Config from Request-Parameter, partial broken?
The behavior was changed in 4.0-ALPHA with SOLR-2115. See especially my comment from July 20, 2012. There are 3 important changes here: - you can specify a new data-config.xml filename or location on the request using the "config" parameter. You do not need to put one in solrconfig.xml, but still may, to have a default. - As an alternative to using a data-config.xml file, you can always pass a full configuration on the request using the "dataConfig" parameter. You used to be able to do that only in debug mode. - the data-config.xml is always parsed and re-loaded with each import. This makes it unnecessary to issue "reload-config" every time you want to use a new configuration. I think debug mode used to do this also, but now it always does this. Although I'm probably the one person doing the most work on DIH code currently, I've never used the interactive debug mode. It's sort of documented a little at http://wiki.apache.org/solr/DataImportHandler#Interactive_Development_Mode . The most important aspect of it is that it activates all of that DebugLogger code that is everywhere cluttering up DIH. I think the interactive screens are supposed to take in all of those log messages and do something with them graphically for the user. I don't mean to discourage you, but I was kinda hoping if SOLR-4151 was left for dead long enough and people got used to it not being there, we could just kill DebugLogger... James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Stefan Matheis [mailto:matheis.ste...@gmail.com] Sent: Friday, January 11, 2013 3:36 PM To: dev@lucene.apache.org Subject: DIH - Using temporary Config from Request-Parameter, partial broken? 
Hey Guys While working on SOLR-4151 (DIH 'debug' mode missing from 4.x UI) I skimmed through the code and found this one: 129 | if (DataImporter.SHOW_CONF_CMD.equals(command)) { 130 | String dataConfigFile = params.get("config"); 131 | String dataConfig = params.get("dataConfig"); 132 | if(dataConfigFile != null) { 133 | dataConfig = SolrWriter.getResourceAsString(req.getCore().getResourceLoader().openResource(dataConfigFile)); 134 | } 135 | if(dataConfig==null) { 136 | rsp.add("status", DataImporter.MSG.NO_CONFIG_FOUND); 137 | } else { 138 | // Modify incoming request params to add wt=raw from http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataImportHandler.java?view=markup What it *should* do, related to the description of the issue (SOLR-2115), is: accept a temporary config (provided by a request parameter) and use it instead of the defined one .. but, as far as I understand the code: any provided config will get overwritten if there is a config file defined in your solrconfig. There is no check in place that this fallback should only happen if there was no (temporary) configuration given .. or am I missing something really important but maybe not completely obvious here? Stefan - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
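The precedence Stefan expects could be sketched like this. This is hypothetical, simplified logic, not a patch against DataImportHandler; the parameter names only mirror the DIH request parameters:

```java
// Hypothetical sketch of the precedence SOLR-2115 describes: an inline
// "dataConfig" passed on the request should win over the "config" file named
// in solrconfig.xml. Simplified pseudologic, not a DataImportHandler patch.
public class DihConfigPrecedence {
    static String resolveConfig(String inlineDataConfig, String configFileContents) {
        if (inlineDataConfig != null) {
            // a temporary config on the request takes priority
            return inlineDataConfig;
        }
        // only fall back to the solrconfig.xml-defined file when none was given
        return configFileContents;
    }

    public static void main(String[] args) {
        System.out.println(resolveConfig("<dataConfig/>", "<fromFile/>")); // inline wins
        System.out.println(resolveConfig(null, "<fromFile/>"));            // file fallback
    }
}
```

In the snippet Stefan quotes, the file branch runs unconditionally whenever dataConfigFile is non-null, which is exactly the missing check.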
RE: DIH - Using temporary Config from Request-Parameter, partial broken?
I don't know of anything that doesn't currently work with reloading the config files. I'm not sure the unit test on it handles the case where both "config" and "dataConfig" are specified. I guess I don't know what happens in that case. Maybe that could be the bug? I did see where the response for having no config at all would be better as a 404 than a 200, and I agree with that. Also, I don't want to discourage you from including the debugger if you've already done (most of) the work on the front-end. If the work is done, someone out there will appreciate it. I just didn't imagine this would get fixed so quickly; I thought if few people complained we could just deep-six the feature. If it survives, then possibly the backend code can be improved, tests can be written, it can be better documented, etc. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Stefan Matheis [mailto:matheis.ste...@gmail.com] Sent: Friday, January 11, 2013 4:21 PM To: dev@lucene.apache.org Subject: Re: DIH - Using temporary Config from Request-Parameter, partial broken? Hey James Thanks for the quick reply! I already read the comments on SOLR-2115, was just not sure if the case was not described and therefore not really existing, or maybe just forgotten - so to confirm, what doesn't work is having a config file defined in solrconfig and still overwriting that with a configuration provided by request, right? On Friday, January 11, 2013 at 11:06 PM, Dyer, James wrote: I don't mean to discourage you but I was kinda hoping if SOLR-4151 was left for dead long enough and people got used to it not being there we could just kill DebugLogger... I'm completely fine with that James, no worries :) I never used it myself, I just took it into the work while working on other dataimport things in the UI .. 
if you like to drop that, we could easily revert that part of the work and avoid people starting to use it (again) just because it's there Stefan
failure with oal.util.TestMaxFailuresRule
I can reliably reproduce a failure in trunk on this test with: -Dtests.seed=3FACDC7EBD23CB80:3D65D783617F94F1 This happens on both Linux and Windows. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: lost entries in trunk/lecene/CHANGES.txt
I do apologize for causing problems. But I do usually merge. However, if it is a trivial change (say, just a small test fix) it is a ton faster to just make the change to both branches instead of doing a merge. I guess I do not understand why this causes problems with seemingly unrelated code (I can be pretty sure the code involved with LUCENE-4585 is entirely separate from the code I've been modifying). Is it really a bad thing to make a trivial change this way? Perhaps the issue is that when I do a merge, if I notice directories that have property changes only, I omit them. Should I be including these? Often these are seemingly random directories and I never quite understand why they are being included. (Maybe it's just my ignorance of svn.) Perhaps this is the problem? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Sunday, December 09, 2012 3:59 AM To: dev@lucene.apache.org Subject: RE: lost entries in trunk/lecene/CHANGES.txt Hi, I checked a little bit in the commit logs what was going on. From what I can reconstruct: - James Dyer did not use SVN merging to 4.x; he copied the whole file into the 4.x folder. This explains why the 5.0 changes entries suddenly appeared in the 4.x branch (which I removed yesterday). James seems to never merge his changes between branches; he applies the patch several times or just copies files. - The commit where the entries got lost, that Doron restored an hour ago, seems to have copied an older version of the CHANGES.txt file over the newer version in SVN. This should be impossible with SVN, unless you “svn up” your current working directory and fix the conflicts by telling SVN to use the older modified (“your”) version instead of doing a 3-way merge. One should use 3-way merge to do this (e.g. with TortoiseSVN or Subclipse or by hand, arrgh ☺). It looks like James created the patch with an older SVN checkout but failed to merge the changes. 
James: Can you in the future please use “svn merge” (or the corresponding workflow in your GUI) to merge the changes between branches. This merge adds special “properties” to the SVN log, so one can find out which patches were merged between branches. E.g. TortoiseSVN or Subclipse show those in a different color in the commit log, which helps immensely if you are about to merge some changes. If you need some help with merging correctly, read http://wiki.apache.org/lucene-java/SvnMerge or just ask me. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Sunday, December 09, 2012 10:15 AM To: dev@lucene.apache.org Subject: RE: lost entries in trunk/lecene/CHANGES.txt They were partly (but in a different way) also missing in 4.x. I synced the part from version 4.1 down to version 0 with trunk. 3 entries were missing. Trunk now only has 5.0 as an additional section; the remaining stuff is identical. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Doron Cohen [mailto:cdor...@gmail.com] Sent: Sunday, December 09, 2012 9:30 AM To: dev@lucene.apache.org Subject: lost entries in trunk/lecene/CHANGES.txt Hi, seems some entries were lost when committing LUCENE-4585 (Spatial PrefixTree based Strategies). http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/CHANGES.txt?r1=1418005&r2=1418006&pathrev=1418006&view=diff I think I'll just add them back... Doron
RE: lost entries in trunk/lecene/CHANGES.txt
I'm using Subclipse with JavaHL 1.7.7. I am unclear whether JavaHL keeps its versioning equivalent to official svn versions. I do not have an official svn command line installed, do not use Tortoise or other tools, etc. Reading Uwe's comment that I never merge, I do wonder if it's just that I should let the directory property changes merge in also, even if I do not understand them. I just don't like to commit stuff that seems unrelated to what I'm doing and that I don't understand. This fits, because if it appears I never merge, I also always omit seemingly unrelated property changes when committing a merge. I also would like an answer to my question: Is it ok to make parallel changes instead of a merge if it's just a trivial change? Follow-up question: is it ok to make the same (trivial) change to 2 branches with 1 commit? It really is very slow for me to merge, and if the way I've handled trivial changes in the past breaks things for other people, I can change my ways, or just not fix tiny things if time doesn't allow. Especially when I get an unexpected jenkins test failure, I'm usually in the middle of something else and really want to fix jenkins asap but can't always give it a lot of time (getting more coffee, as you might say to do, Robert) while waiting for svn, etc. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, December 10, 2012 9:57 AM To: dev@lucene.apache.org Subject: Re: lost entries in trunk/lecene/CHANGES.txt On Mon, Dec 10, 2012 at 10:53 AM, Dyer, James james.d...@ingramcontent.com wrote: Perhaps the issue is when I do a merge, if I notice directories that have property changes only I omit them. Should I be including these? Often these are seemingly random directories and I never quite understand why these are being included. (Maybe it's just my ignorance of svn.) Perhaps this is the problem? Are you using svn 1.7? I really recommend this! 
RE: TestSqlEntityProcessorDelta failures on Policeman Jenkins
ahhh. I did not know that Policeman overrides (or that you could override) the run-test-serially setting in DIH's build.xml. This explains everything, as placing the file in a private temp directory rather than the default conf dir would solve the issue. I think then keeping it as it is (dialing back the logging, but keeping the properties file in its own temp dir) solves this issue. And if indeed dataimport.properties is the only thing that prevents the DIH tests from running in parallel, it should be an easy enough task to fix this for all the tests, and then we can have parallel tests for DIH. Thanks a bunch to everyone for helping get this cleared up! James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, December 05, 2012 1:31 PM To: dev@lucene.apache.org Subject: Re: TestSqlEntityProcessorDelta failures on Policeman Jenkins : James: How many JVMs does your machine use (you see this at the beginning : when tests start to run)? ... : ok this is the bug. See dih's build.xml: : : <!-- the tests have some parallel problems: writability to single copy of : dataimport.properties --> : <property name="tests.jvms" value="1"/> : : The problem is: policeman jenkins server overrides this by setting the -D : ...and i think, in the specific case of TestSqlEntityProcessorDelta (or more specifically: anything extending AbstractSqlEntityProcessorTestCase) it looks like James fixed the bug in the test when he added the code to help log the state of the file... https://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractSqlEntityProcessorTestCase.java?r1=1408873&r2=1417058 ...because he has the test create a random dir for the property writer to write the file for each test class. right? 
-Hoss
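The per-test-class temp-directory approach discussed above can be sketched in a few lines. This is illustrative only and not the actual Solr test code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch (not the actual Solr test code): give each test class
// its own temp directory for dataimport.properties so parallel test JVMs
// never contend for one shared copy in the default conf dir.
public class PerTestPropertiesDir {
    static Path propertiesFileFor(String testClassName) {
        try {
            // a fresh, uniquely named directory per test class (and per run)
            Path dir = Files.createTempDirectory("dih-" + testClassName + "-");
            return dir.resolve("dataimport.properties");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path a = propertiesFileFor("TestSqlEntityProcessorDelta");
        Path b = propertiesFileFor("TestSqlEntityProcessorDelta");
        // distinct directories, so concurrent writes cannot collide
        System.out.println(!a.equals(b)); // true
    }
}
```

Because createTempDirectory guarantees a unique name per call, two JVMs running the same test class can never step on each other's properties file.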
RE: Active 4.x branches?
Whenever I want to know who owns a piece of code, I just look at the svn history to see who has been modifying it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: David Smiley (@MITRE.org) [mailto:dsmi...@mitre.org] Sent: Thursday, November 29, 2012 8:49 AM To: dev@lucene.apache.org Subject: Re: Active 4.x branches? Those are good points Yonik. I guess I don't know what to think anymore. Yonik Seeley-4 wrote On Thu, Nov 29, 2012 at 1:24 AM, David Smiley (@MITRE.org) <DSMILEY@> wrote: Maybe we should have a roster somewhere of parts of the codebase that have an owner. Taking ownership is a mindset, and is very different from any kind of recognized ownership. We shouldn't tag areas as owned by someone, as that could discourage others from getting involved in that area. It might also encourage deference to the owner, which would also be a bad thing. We sometimes naturally defer to someone with more experience in an area than we have, but it should continue to be on an informal case-by-case basis. It could be useful to people not in the know on who to contact The right contact point is this mailing list. There's already way too much off-list (and off-IRC-channel) collaboration that goes on, IMO. -Yonik http://lucidworks.com - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Active-4-x-branches-tp4022609p4023246.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
RE: Fullmetal Jenkins: Solr4X - Build # 67 - Failure!
it works! Hey... can you tell me a little about Fullmetal. What is this one doing that Policeman isn't? (Obviously something; it found my bugs twice this week and the others didn't...) James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 7:52 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! I've moved to using http://fullmetaljenkins.org/ to host the Jenkins service. Hopefully that makes it so you can visit it with your firewall - it may still be detected as a dynamic ip service though. We will see I guess. - Mark On Nov 20, 2012, at 4:36 PM, Dyer, James james.d...@ingramcontent.com wrote: Thanks Mark. I committed a fix for this. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 3:07 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! Hmmm...bummer. I may be able to address that sometime soon. 
The full trace below: Error Message expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] Stacktrace org.junit.ComparisonFailure: expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] at __randomizedtesting.SeedInfo.seed([FC935E046E15B4D4:84DDD194732F51FA]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) at org.apache.solr.handler.dataimport.TestBuiltInEvaluators.testDateFormatEvaluator(TestBuiltInEvaluators.java:127) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39
RE: Fullmetal Jenkins: Solr4X - Build # 80 - Failure!
By any chance is this Jenkins using a Java 8 build earlier than JDK 8-ea-b65? If so, then it might be hitting https://issues.apache.org/jira/browse/DERBY-5958, which, at least according to the comments, only occurs on earlier revisions of JDK 8. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Wednesday, November 21, 2012 8:47 AM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 80 - Failure! This actually looks like another locale issue: Caused by: java.sql.SQLException: Supplied territory description 'sr__#Latn' is invalid, expecting ln[_CO[_variant]] - Mark On Wed, Nov 21, 2012 at 12:18 AM, nore...@fullmetaljenkins.org wrote: Solr4X - Build # 80 - Failure: Check console output at http://fullmetaljenkins.org/job/Solr4X/80/ to view the results. 1 tests failed. REGRESSION: org.apache.solr.handler.dataimport.TestSimplePropertiesWriter.testSimplePropertiesWriter Error Message: Failed to create database 'memory:derbyDB', see the next exception for details. Stack Trace: java.sql.SQLException: Failed to create database 'memory:derbyDB', see the next exception for details. 
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.init(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection30.init(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection40.init(Unknown Source) at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source) at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source) at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source) at java.sql.DriverManager.getConnection(DriverManager.java:579) at java.sql.DriverManager.getConnection(DriverManager.java:243) at org.apache.solr.handler.dataimport.AbstractDIHJdbcTestCase.buildDatabase(AbstractDIHJdbcTestCase.java:140) at org.apache.solr.handler.dataimport.AbstractDIHJdbcTestCase.beforeDihJdbcTest(AbstractDIHJdbcTestCase.java:93) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at
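A common defensive pattern when a randomized default locale (like the sr__#Latn in the Derby error above) trips up a locale-sensitive library is to pin a known-good default around the sensitive code and restore it afterward. Whether this would actually sidestep the Derby/JDK issue is an assumption, not something verified against Derby:

```java
import java.util.Locale;
import java.util.function.Supplier;

// Defensive pattern for locale-sensitive libraries under randomized-locale
// test runs: pin a known-good default locale while the sensitive code runs,
// then restore the original. Whether this sidesteps the Derby issue above
// is an assumption.
public class LocalePin {
    static <T> T withLocale(Locale locale, Supplier<T> body) {
        Locale saved = Locale.getDefault();
        Locale.setDefault(locale);
        try {
            return body.get();
        } finally {
            Locale.setDefault(saved); // always restore the randomized locale
        }
    }

    public static void main(String[] args) {
        // the very locale from the error message: toString() is "sr__#Latn"
        Locale.setDefault(Locale.forLanguageTag("sr-Latn"));
        String inside = withLocale(Locale.US, () -> Locale.getDefault().toString());
        System.out.println(inside);              // en_US
        System.out.println(Locale.getDefault()); // sr__#Latn again
    }
}
```

The try/finally restore matters in tests: the framework's SystemPropertiesInvariantRule-style checks will complain if the default locale leaks.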
RE: Fullmetal Jenkins: Solr4X - Build # 67 - Failure!
Mark, Can you tell me which test failed? I still cannot get into Fullmetal Jenkins. Unfortunately my company's firewall blocks it due to Dynamic DNS. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 2:30 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! FYI locale type issue again: org.junit.ComparisonFailure: expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] at __randomizedtesting.SeedInfo.seed([FC935E046E15B4D4:84DDD194732F51FA]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) On Nov 20, 2012, at 2:58 PM, nore...@fullmetaljenkins.homelinux.org wrote: Solr4X - Build # 67 - Failure: Check console output at http://fullmetaljenkins.homelinux.org/job/Solr4X/67/ to view the results. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Fullmetal Jenkins: Solr4X - Build # 67 - Failure!
) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:722) On Tue, Nov 20, 2012 at 3:58 PM, Dyer, James james.d...@ingramcontent.com wrote: Mark, Can you tell me which test failed? I still cannot get into Fullmetal Jenkins. Unfortunately my company's firewall blocks it due to Dynamic DNS. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Tuesday, November 20, 2012 2:30 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 67 - Failure! FYI locale type issue again: org.junit.ComparisonFailure: expected:[२०१२-११-१८ २०:५८] but was:[2012-11-18 20:58] at __randomizedtesting.SeedInfo.seed([FC935E046E15B4D4:84DDD194732F51FA]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) On Nov 20, 2012, at 2:58 PM, nore...@fullmetaljenkins.homelinux.org wrote: Solr4X - Build # 67 - Failure: Check console output at http://fullmetaljenkins.homelinux.org/job/Solr4X/67/ to view the results. -- - Mark
RE: Fullmetal Jenkins: Solr4X - Build # 28 - Failure!
I noticed this and made a subsequent fix for this test (4x: r1411348 / trunk: r1411334). I'm having difficulty getting to this Jenkins, so I'm not sure whether this failure is from before or after that commit. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, November 19, 2012 12:27 PM To: dev@lucene.apache.org Subject: Re: Fullmetal Jenkins: Solr4X - Build # 28 - Failure! This test uses the SimpleDateFormat ctor in several places that implicitly uses the default locale. (And other date handling with explicit default timezone and so on; I guess that might be a separate issue.) On Mon, Nov 19, 2012 at 1:20 PM, Mark Miller markrmil...@gmail.com wrote: On Nov 19, 2012, at 1:18 PM, nore...@fullmetaljenkins.homelinux.org wrote: Solr4X - Build # 28 - Failure: Check console output at http://fullmetaljenkins.homelinux.org:8080/job/Solr4X/28/ to view the results. Looks like a failure due to a locale issue: org.junit.ComparisonFailure: expected:[๒๕๕๕-๑๑-๑๙ ๐๐:๐๐] but was:[2012-11-19 00:00] at __randomizedtesting.SeedInfo.seed([52960D776A05F024:97EE7E504322E141]:0) at org.junit.Assert.assertEquals(Assert.java:125) at org.junit.Assert.assertEquals(Assert.java:147) at org.apache.solr.handler.dataimport.TestVariableResolver.testFunctionNamespace1(TestVariableResolver.java:152) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
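Robert's point about the locale-sensitive SimpleDateFormat constructor can be demonstrated in a few lines; passing an explicit locale (and timezone) makes the formatted output deterministic regardless of the randomized test locale:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Demonstrates the pitfall: new SimpleDateFormat(pattern) uses the default
// locale, so a Thai-digit or Devanagari-digit locale changes the output.
// Passing an explicit locale (and timezone) makes the result deterministic.
public class ExplicitLocaleFormat {
    public static void main(String[] args) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // pin the timezone too
        System.out.println(fmt.format(new Date(0L))); // 1970-01-01 00:00
    }
}
```

With the no-Locale constructor under a Thai default locale, the same code would print Thai digits, which is exactly the assertion failure in the Jenkins report above.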
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #153: POMs out of sync
Thank you! James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Tuesday, November 13, 2012 6:07 AM To: dev@lucene.apache.org Subject: Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #153: POMs out of sync I committed a fix to the Maven configuration: Derby is now a DIH test dependency On Nov 13, 2012, at 5:31 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/153/ 9 tests failed. FAILED: org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta.org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta Error Message: org.apache.derby.jdbc.EmbeddedDriver Stack Trace: java.lang.ClassNotFoundException: org.apache.derby.jdbc.EmbeddedDriver
RE: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2209 - Failure!
My mistake, sorry. I've got these tests set to @Ignore for now, with a better fix to follow soon. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Policeman Jenkins Server [mailto:jenk...@sd-datasolutions.de] Sent: Monday, November 05, 2012 12:41 PM To: dev@lucene.apache.org; jd...@apache.org; mikemcc...@apache.org Subject: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2209 - Failure! Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2209/ Java: 32bit/jdk1.7.0_09 -server -XX:+UseConcMarkSweepGC 2 tests failed. REGRESSION: org.apache.solr.handler.PingRequestHandlerTest.testDisablingServer Error Message: Should have thrown a SolrException because not enabled yet Stack Trace: java.lang.AssertionError: Should have thrown a SolrException because not enabled yet at __randomizedtesting.SeedInfo.seed([5F8D3F3DEBB9E6DE:8D7CAB800C63B692]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.handler.PingRequestHandlerTest.testDisablingServer(PingRequestHandlerTest.java:140) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at
RE: Service Unavailable exceptions not logged
This was done with https://issues.apache.org/jira/browse/SOLR-2124 . The idea is that it is enough to get a 1-line log whenever PingRequestHandler is hit (which will have the response code). There is no need to also log a severe exception with a stack trace as this is not really an error condition. So if you use PingRequestHandler to take nodes out of a load balancer rotation, it won't create huge logs. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 1:55 PM To: dev@lucene.apache.org Subject: Service Unavailable exceptions not logged Why are service unavailable exceptions not logged? In the SolrException class, these error codes are specifically skipped from logging, but I don't understand why. This is the 'log' method of the SolrException class: public static void log(Logger log, Throwable e) { if (e instanceof SolrException && ((SolrException) e).code() == ErrorCode.SERVICE_UNAVAILABLE.code) { return; } String stackTrace = toStr(e); String ignore = doIgnore(e, stackTrace); if (ignore != null) { log.info(ignore); return; } log.error(stackTrace); } Tomás
RE: Service Unavailable exceptions not logged
Maybe we could just create a second entry in the ErrorCode enum for 503, say, SERVICE_UNAVAILABLE_NOT_LOGGED, and change PingRequestHandler to throw exceptions with this new ErrorCode... James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 2:32 PM To: dev@lucene.apache.org Subject: Re: Service Unavailable exceptions not logged Hmmm, I see. The problem I'm having is that with SolrCloud, in the case of no available nodes for a shard the created exception is a 503, and this is something I would like to see logged. Maybe that exception code should be changed? On Tue, Oct 30, 2012 at 4:15 PM, Dyer, James james.d...@ingramcontent.com wrote: This was done with https://issues.apache.org/jira/browse/SOLR-2124 . The idea is that it is enough to get a 1-line log whenever PingRequestHandler is hit (which will have the response code). There is no need to also log a severe exception with a stack trace as this is not really an error condition. So if you use PingRequestHandler to take nodes out of a load balancer rotation, it won't create huge logs. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 1:55 PM To: dev@lucene.apache.org Subject: Service Unavailable exceptions not logged Why are service unavailable exceptions not logged? In the SolrException class, these error codes are specifically skipped from logging, but I don't understand why.
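The enum-based idea above can be sketched as follows. This is a hypothetical illustration, not Solr's actual ErrorCode or SolrException code: the entry name SERVICE_UNAVAILABLE_NOT_LOGGED is the proposal itself, and the logging helper is reduced to a predicate.

```java
// Hypothetical sketch of the proposal: a second ErrorCode entry for HTTP 503
// that the logging helper treats as "do not log". Not Solr's real ErrorCode enum.
public class ErrorCodeSketch {
    enum ErrorCode {
        SERVICE_UNAVAILABLE(503),
        SERVICE_UNAVAILABLE_NOT_LOGGED(503); // same HTTP code, different logging policy
        final int code;
        ErrorCode(int code) { this.code = code; }
    }

    // Keying off the enum entry rather than the numeric code means SolrCloud's
    // "no nodes available" 503 (plain SERVICE_UNAVAILABLE) would still be logged,
    // while PingRequestHandler could opt out by using the NOT_LOGGED entry.
    static boolean shouldLog(ErrorCode ec) {
        return ec != ErrorCode.SERVICE_UNAVAILABLE_NOT_LOGGED;
    }

    public static void main(String[] args) {
        System.out.println(shouldLog(ErrorCode.SERVICE_UNAVAILABLE));
        System.out.println(shouldLog(ErrorCode.SERVICE_UNAVAILABLE_NOT_LOGGED));
    }
}
```

Both entries carry code 503 on the wire; only the logging decision differs.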
This is the 'log' method of the SolrException class: public static void log(Logger log, Throwable e) { if (e instanceof SolrException && ((SolrException) e).code() == ErrorCode.SERVICE_UNAVAILABLE.code) { return; } String stackTrace = toStr(e); String ignore = doIgnore(e, stackTrace); if (ignore != null) { log.info(ignore); return; } log.error(stackTrace); } Tomás
RE: Service Unavailable exceptions not logged
Possibly better is to introduce yet one more overloaded constructor with a boolean that suppresses logging and change PRH to use it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 2:32 PM To: dev@lucene.apache.org Subject: Re: Service Unavailable exceptions not logged Hmmm, I see. The problem I'm having is that with SolrCloud, in the case of no available nodes for a shard the created exception is a 503, and this is something I would like to see logged. Maybe that exception code should be changed? On Tue, Oct 30, 2012 at 4:15 PM, Dyer, James james.d...@ingramcontent.com wrote: This was done with https://issues.apache.org/jira/browse/SOLR-2124 . The idea is that it is enough to get a 1-line log whenever PingRequestHandler is hit (which will have the response code). There is no need to also log a severe exception with a stack trace as this is not really an error condition. So if you use PingRequestHandler to take nodes out of a load balancer rotation, it won't create huge logs. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] Sent: Tuesday, October 30, 2012 1:55 PM To: dev@lucene.apache.org Subject: Service Unavailable exceptions not logged Why are service unavailable exceptions not logged? In the SolrException class, these error codes are specifically skipped from logging, but I don't understand why. This is the 'log' method of the SolrException class: public static void log(Logger log, Throwable e) { if (e instanceof SolrException && ((SolrException) e).code() == ErrorCode.SERVICE_UNAVAILABLE.code) { return; } String stackTrace = toStr(e); String ignore = doIgnore(e, stackTrace); if (ignore != null) { log.info(ignore); return; } log.error(stackTrace); } Tomás
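The constructor-flag alternative can be sketched like this. Again a hypothetical illustration, not Solr's real SolrException: the class, the field name "logged", and the StringBuilder standing in for a logger are all assumptions made for the sketch.

```java
// Hypothetical sketch of the alternative proposal: an extra constructor argument
// marks an exception as "already handled elsewhere, don't log". Not Solr's code.
public class QuietExceptionSketch {
    static class AppException extends RuntimeException {
        final int code;
        final boolean logged; // true = logging is suppressed for this instance
        AppException(int code, String msg) { this(code, msg, false); }
        AppException(int code, String msg, boolean logged) {
            super(msg);
            this.code = code;
            this.logged = logged;
        }
    }

    static final StringBuilder LOG = new StringBuilder(); // stand-in for a real logger

    // The log helper keys off the per-instance flag, not the 503 status code,
    // so SolrCloud's "no nodes available" 503 would still appear in the logs.
    static void log(Throwable e) {
        if (e instanceof AppException && ((AppException) e).logged) {
            return; // a ping handler would construct its 503s with logged=true
        }
        LOG.append(e.getMessage()).append('\n');
    }

    public static void main(String[] args) {
        log(new AppException(503, "no server hosting shard")); // logged
        log(new AppException(503, "ping: service disabled", true)); // suppressed
        System.out.print(LOG);
    }
}
```

The difference from the enum idea is that suppression becomes a per-throw decision instead of a per-error-code one.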
RE: [Discuss] Should Solr be an AppServer agnostic WAR or require Jetty?
From my perspective, working for a company that uses Solr on a bunch of apps, I really wish we keep it agnostic. I see the case for documenting that our testing process exclusively uses Jetty 7, we've included it in our distribution, and we recommend it. But I don't see why we need to be naming our parameters JettyThis or JettyThat and telling people they've got to use Jetty. The fact is users often need to use other containers. In my company, we use JBoss 5. That's it. We have a big support contract for it, our server admins know it, etc. If we were forced to use Jetty, then we would grudgingly use it, but then our cost of ownership just went up a little. On the other hand, expecting to test every possible container before you can tell people it's supported for a standards-compliant Java web app is just crazy. This is like saying that DIH's SQLEntityProcessor is only supported for HSQLDB because that's the one we test against, or that you can't run Lucene on Solaris because Uwe's Jenkins doesn't have a Solaris environment. Perhaps, though, there is a middle ground. Beyond telling people what we test and what we recommend, maybe we can write a few tests that check for known bugs from popular servlet/J2EE containers. Or even a wiki page that says something like: Some containers have this bug which can hurt in these instances. To check if your container is stricken with this problem, try this... But in the end, the advice should be just like what we say when people ask how big a server they need or what to set their Java heap to: test thoroughly before going to production. This is reminding me of one of my pet peeves back when we had Endeca: they had 3 supported OSes. That's it. The fact that Solr could run in any standards-compliant environment was a big plus in my mind. 
James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Friday, July 13, 2012 8:31 AM To: dev@lucene.apache.org Subject: Re: [Discuss] Should Solr be an AppServer agnostic WAR or require Jetty? On Jul 13, 2012, at 9:19 AM, Robert Muir wrote: I know the wiki used to say the release manager should go and manually test alternative containers before releasing: I refuse to do that. Its not the release manager's job. That's insane anyhow :) The RM can't thoroughly test each of the other containers as a 'step' in the release process at the end of the cycle :) Absurd. I think that basically meant just a smoke test, because it could not mean much more. Not sure how much good in the world that bought you, but I agree it's not the RM's job. We know we have a good experience with exactly one version of one web container - the one we ship. We actually have been pretty public about this over the past couple years - we have just not changed the website. I can find a multitude of quotes from various Lucene/Solr committers talking about how bad an idea it is not to use Jetty due to a variety of issues. You are asking for a poor experience. - Mark Miller lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Low pause GC for java 1.6
Bill, As you know, it really depends on the size of your index combined with which features you're using. There is really no substitute for having a good load test and monitoring tool and running multiple tests while trying different settings. My guess is that you're experiencing full gc's, even with CMS enabled. This means either your tenured (old) generation is too small or you have the -XX:CMSInitiatingOccupancyFraction set too high (it starts the CMS too late and runs out of memory before it can finish). We've found that some of the defaults the JVM picks and/or the general advice out there doesn't apply to an app like Solr, which is just a different kind of animal than the typical web frontend you might run in a J2EE container. Below are the settings I am using as a starting point for our development Solr 4.0 app. These may or may not work for you but at least should give you a basic idea of how one other installation is configured. Also, if you're using older grouping patches (I remember you worked on some of these), perhaps you're hitting some of the scalability problems that were predicted for some of these? I'm pretty sure the GA grouping features in 3.x solved these problems though. Finally, you probably will get better responses on the users list than the dev list. Also, other users might benefit from other answers you get, so perhaps you could cross-post your question. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 # Basic JVM settings. I think the 3g new generation size is bigger than you'd normally have with a typical web app but for us it makes the old gen fill up slower and have fewer CMS gc's. Minor (parnew) gc's are still fast enough for us, even with a biggish new gen. 
-XX:MaxNewSize=3000m -XX:NewSize=3000m -Xms20g -Xmx20g -XX:MaxPermSize=256m
# These are our CMS settings
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSParallelRemarkEnabled -XX:CMSMaxAbortablePrecleanTime=15000
# Trial and error found this to be the sweet spot for our 16-way machines.
-XX:ParallelGCThreads=8
# You want these so you can see in your logs what is going on. There are some tutorials on the web on how to make sense of verbose garbage collection. There's no problem using these in production.
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
# Use this on a 64-bit machine unless your JVM is too old to support it (on by default on newer JVMs, I think)
-XX:+UseCompressedOops
# We found these save a little memory
-XX:+UseStringCache -XX:+UseCompressedStrings
-Original Message- From: Bill Bell [mailto:billnb...@gmail.com] Sent: Saturday, June 30, 2012 8:49 PM To: Bill Bell Cc: dev@lucene.apache.org Subject: Re: Low pause GC for java 1.6 Nothing? Bill Bell Sent from mobile On Jun 29, 2012, at 9:09 PM, Bill Bell billnb...@gmail.com wrote: We are getting large Solr pauses on Java garbage collection in 1.6 Java. We have tried CMS. But we still have a 4 second wait on GC. What works well for Solr when using 16 GB of RAM? I have read lots of articles and now just looking for practical advice and examples. Sent from my Mobile device 720-256-8076 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Help: SOLR-3430 and Build changes
Let me apologize in advance for my (almost) complete ignorance of everything build related: Maven, Ivy, Ant, etc. Sorry! For SOLR-3430, I am introducing a dependency on derby.jar, which will be needed only to run DIH tests. So I don't want it included in the Solr .war. It just needs to be in the classpath when junit runs. 1. Where should I put the .jar/license/notice/sha1 files? 2. How do I modify the build so that it will be in the classpath for running tests only? 3. What do I need to do to get Ivy and Maven to pick it up? 4. I'll try my best to get the eclipse/intellij setup correct but I'm only able to test eclipse. I really want to get this right so please give advice. Thanks. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: Help: SOLR-3430 and Build changes
I did a little digging on this and I'm not sure relying on JavaDB is such a sure bet. It's a verbatim copy of Derby 10.2 and, while bundled in with the JVM, it's not in the classpath by default. Also, I have 2 Oracle 1.6 JVMs on my PC and only one includes it. Also, while the documentation says it is in the db directory, on my installation it's in the javadb directory. It would be tricky at best to reliably get this in the tester's classpath, I think. It would be safer I think to just include the jar. My thoughts were to eventually migrate the example to use derby instead of hsqldb. Maybe I should either change my test to use hsqldb or change the example to use derby. Then as Robert points out, it's just a minor build modification to use the jar from the example. In any case, the current Mock datasource doesn't emulate a real JDBC driver very well and I found it was extremely simple to use Derby in in-memory embedded mode (all you do is issue DriverManager#getConnection with the correct string). There are no config files, etc. I don't know if you want to call this a unit test or an integration test (and what are all those other Solr tests that use Jetty, etc?). In the end, I just want readable tests that are true to real life, which DIH lacks right now. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Wednesday, May 02, 2012 2:16 PM To: dev@lucene.apache.org Subject: RE: Help: SOLR-3430 and Build changes I have not checked this, but if the JavaDB is in the JDK official JavaDocs and is therefore part of JDK6 spec? We have to check this, but *if* the package names start with java.db or whatever it *has* to be also in alternate JDK impls. At least OpenJDK also downloads derby while building. 
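The in-memory embedded mode mentioned above can be sketched as follows. This is an illustrative example, not the SOLR-3430 test code: the database name "dihtest" and the table are made up, and the connection attempt is guarded because it only succeeds when derby.jar is actually on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;

// Sketch of Derby's in-memory embedded mode: the whole database lives in the
// JVM heap, needs no config files, and disappears when the JVM exits.
public class DerbyInMemorySketch {
    // "memory:" sub-protocol selects the in-memory back end; ";create=true"
    // creates the database on first connect.
    static String url(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        String u = url("dihtest");
        System.out.println(u);
        try (Connection conn = DriverManager.getConnection(u)) {
            // Only reached when derby.jar is on the classpath.
            conn.createStatement().execute("CREATE TABLE books (id INT, title VARCHAR(100))");
            System.out.println("connected and created table");
        } catch (Exception e) {
            System.out.println("Derby driver not on classpath: " + e.getMessage());
        }
    }
}
```

Unlike a file-backed Derby database, nothing has to be cleaned up between test runs, which is what makes this attractive for DIH tests.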
- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Wednesday, May 02, 2012 8:42 PM To: dev@lucene.apache.org Subject: Re: Help: SOLR-3430 and Build changes On Wed, May 2, 2012 at 1:51 PM, Uwe Schindler u...@thetaphi.de wrote: One note: Derby is included since JDK 6 as JavaDB together with the JDK: http://www.oracle.com/technetwork/java/javadb/overview/index.html As Lucene/Solr 4 will be using JDK 6 as the minimum requirement (in contrast to Solr 3.x, which was JDK 5), can we not simply rely on this version shipped with the JDK? That would make life easy. And for simple tests that version should be enough... But we don't require *Oracle's* implementation as a minimum requirement. We also support IBM etc. too? -- lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
How do I document bugfixes applied to 3.6 branch?
I would like to commit SOLR-3361 to 3.6 in case there is a 3.6.1. Should I start a new 3.6.1 section in changes.txt? Does it just go under 4.0 or 3.0 with a note that it is in the 3.6 branch also (but not released)? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2314 - Failure
This is a test bug. I am committing a fix... James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Monday, April 23, 2012 2:09 PM To: dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2314 - Failure Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2314/ 1 tests failed. REGRESSION: org.apache.solr.handler.TestReplicationHandler.test Error Message: Backup success not detected:?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status0/intint name=QTime1/int/lstlst name=detailsstr name=indexSize21,58 KB/strstr name=indexPath/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk-java7/checkout/solr/build/solr-core/test/S0/org.apache.solr.handler.TestReplicationHandler$SolrInstance-1335207994266/master/data/index/strarr name=commitslstlong name=indexVersion1335208021601/longlong name=generation19/longarr name=fileliststr_b.per/strstr_b_0.frq/strstr_b_nrm.cfe/strstr_b.fnm/strstr_b.fdt/strstr_b_nrm.cfs/strstrsegments_j/strstr_b_0.tim/strstr_b.fdx/strstr_b_0.tip/str/arr/lst/arrstr name=isMastertrue/strstr name=isSlavefalse/strlong name=indexVersion1335208021601/longlong name=generation19/longlst name=masterstr name=confFilesschema-replication2.xml:schema.xml/strarr name=replicateAfterstrcommit/str/arrstr name=replicationEnabledtrue/strlong name=replicatableGeneration19/long/lst/lststr name=WARNINGThis response format is experimental. It is likely to change in the future./str /response Stack Trace: java.lang.AssertionError: Backup success not detected:?xml version=1.0 encoding=UTF-8? 
response lst name=responseHeaderint name=status0/intint name=QTime1/int/lstlst name=detailsstr name=indexSize21,58 KB/strstr name=indexPath/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk-java7/checkout/solr/build/solr-core/test/S0/org.apache.solr.handler.TestReplicationHandler$SolrInstance-1335207994266/master/data/index/strarr name=commitslstlong name=indexVersion1335208021601/longlong name=generation19/longarr name=fileliststr_b.per/strstr_b_0.frq/strstr_b_nrm.cfe/strstr_b.fnm/strstr_b.fdt/strstr_b_nrm.cfs/strstrsegments_j/strstr_b_0.tim/strstr_b.fdx/strstr_b_0.tip/str/arr/lst/arrstr name=isMastertrue/strstr name=isSlavefalse/strlong name=indexVersion1335208021601/longlong name=generation19/longlst name=masterstr name=confFilesschema-replication2.xml:schema.xml/strarr name=replicateAfterstrcommit/str/arrstr name=replicationEnabledtrue/strlong name=replicatableGeneration19/long/lst/lststr name=WARNINGThis response format is experimental. It is likely to change in the future./str /response at __randomizedtesting.SeedInfo.seed([A93FC569246BDF7A:216BFAB38A97B282]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.handler.TestReplicationHandler.doTestBackup(TestReplicationHandler.java:895) at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1913) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:131) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:805) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:866) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:880) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:760) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:682) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:615) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:654) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:812) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:131) at
bad jetty jars for 3.x?
Whenever I try to run tests for 3.x I am getting problems with the jetty jars for the solr example. Before the checksums were added I was getting an error reading the jar. Now I get a bad checksum error. [licenses] CHECKSUM FAILED for ... solr\example\lib\jetty-6.1.26-patched-JETTY-1340.jar (expected: baa65a6f9940f2977fa152221522c0fce84d8c92 was: d446a42a8399e30a8c6e8cfbfb135a6111ea689c) [licenses] CHECKSUM FAILED for ... solr\example\lib\jetty-util-6.1.26-patched-JETTY-1340.jar (expected: 1cd718806c8f0baa318ea4a9c3a5e2f82e27f0e6 was: 186e4c23c58c0eb51342aec9cec92679d70f6c0c) Any ideas what I can do? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
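The failing check above compares each jar's actual SHA-1 against the committed checksum. A minimal sketch of that comparison, using a throwaway dummy file in place of the real jetty jar (all paths here are illustrative):

```shell
# Recreate the license checker's comparison with a dummy file standing in for the jar.
printf 'dummy jar contents' > /tmp/demo.jar
sha1sum /tmp/demo.jar | awk '{print $1}' > /tmp/demo.jar.sha1

expected=$(cat /tmp/demo.jar.sha1)
actual=$(sha1sum /tmp/demo.jar | awk '{print $1}')

# A mismatch here is what produces the "CHECKSUM FAILED" lines in the build output.
if [ "$expected" = "$actual" ]; then
  echo "checksum OK"
else
  echo "CHECKSUM FAILED (expected: $expected was: $actual)"
fi
```

A checksum mismatch on a freshly downloaded jar usually means the download was corrupted or truncated, which is why re-fetching the jar (e.g. after clearing the Ivy cache) is the first thing to try.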
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #449: POMs out of sync
SOLR-3011 (3.x-only bug fixes for DIH threading) included some more-intense multi-threaded unit tests. I fear that the bugs were not fully solved but occur only on certain platforms/environments. This is the second time this test has failed this week, both times during the Maven build. I'd imagine the Maven tests are run in just such a way as to trigger whatever is happening. This feature is removed in Trunk, and the best solution might be to dial back the tests a little and increase the severity of the warning on the wiki about using threads with DIH. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Wednesday, April 04, 2012 2:46 PM To: dev@lucene.apache.org Subject: RE: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #449: POMs out of sync This doesn't reproduce for me locally under Ant or under Maven. - Steve -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Wednesday, April 04, 2012 3:13 PM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #449: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-3.x/449/ 1 tests failed. 
REGRESSION: org.apache.solr.handler.dataimport.TestThreaded.testCachedThread_FullImport Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:409) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:376) at org.apache.solr.handler.dataimport.TestThreaded.verify(TestThreaded.java:73) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:630) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:536) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:457) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:508) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:146) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:36) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
RE: bad jetty jars for 3.x?
Looked a little more into it and found the issue is that http://cloud.github.com/downloads/rmuir is blocked on the network I'm on. Is hosting these on Robert's github space a permanent arrangement? If so, I imagine I can get the jars from an old checkout and manually put them in the ivy repository? Of course, if the whole build didn't fail because the example couldn't be compiled, that would be a nice plus. For instance, I've been just trying to run the DIH tests, so I don't see why I need to care if the example will compile. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, April 04, 2012 3:32 PM To: dev@lucene.apache.org Subject: Re: bad jetty jars for 3.x? : Whenever I try to run tests for 3.x I am getting problems with the jetty : jars for the solr example. Before the checksums were added I was : getting an error reading the jar. Now I get a bad checksum error. sounds like it was corrupted when downloading? try ant clean-jars and if that doesn't work then try removing it from your ivy cache and do ant clean-jars again (we probably need to add info about this to the HowToContribute page) -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #443: POMs out of sync
I tried this seed on my 4-core Windows machine several times but no failure. This test failure might indicate that the DIH threading bugs aren't really fixed in 3.6. On the other hand, users of DIH threads on 3.6 will get a deprecation warning, the wiki discourages it and the feature is gone in 4.0. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Saturday, March 31, 2012 8:45 AM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #443: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-3.x/443/ 1 tests failed. REGRESSION: org.apache.solr.handler.dataimport.TestThreaded.testCachedThread_FullImport Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:409) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:376) at org.apache.solr.handler.dataimport.TestThreaded.verify(TestThreaded.java:73) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:630) at 
org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:536) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:457) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:508) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:146) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:36) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53) at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110) at
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync
I'm looking at it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Wednesday, March 21, 2012 3:35 PM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/432/ 3 tests failed. FAILED: org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript Error Message: Cannot load Script Engine for language: JavaScript Stack Trace: org.apache.solr.handler.dataimport.DataImportHandlerException: Cannot load Script Engine for language: JavaScript at org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:76) at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:53) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:192) at org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript(TestScriptTransformer.java:122) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at 
org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at 
org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync
This is confusing me because the non-Maven build with this commit (#12834 on jenkins) passed. So that JVM has the Rhino JavaScript engine. I guess the Maven build (for Trunk) is using a different 1.6 JRE than the non-Maven build? One without Rhino? Is there any way to use the same JRE? In any case, let me add the ignore back in for this one exception. It's unlikely to mask a real problem, and it will let people who have non-Rhino-equipped 1.6 JVMs have the tests pass. I'll commit this shortly. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Dyer, James [mailto:james.d...@ingrambook.com] Sent: Wednesday, March 21, 2012 3:40 PM To: dev@lucene.apache.org Subject: RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync I'm looking at it. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Wednesday, March 21, 2012 3:35 PM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #432: POMs out of sync Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/432/ 3 tests failed. 
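A quick way to see which script engines a given JVM actually registers (and hence whether "JavaScript"/Rhino is available on a particular 1.6 JRE) is a probe like the one below. It uses only the standard javax.script API; whether the JavaScript engine is present is, of course, environment-dependent.

```java
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

public class ListScriptEngines {
    // True if this JVM registers an engine under the name "JavaScript"
    // (Rhino on some 1.6 JREs, absent on others).
    static boolean hasJavaScript() {
        return new ScriptEngineManager().getEngineByName("JavaScript") != null;
    }

    public static void main(String[] args) {
        // List every engine this JVM provides and the names it answers to.
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            System.out.println(f.getEngineName() + " " + f.getEngineVersion()
                + " registered as " + f.getNames());
        }
        System.out.println("JavaScript engine present: " + hasJavaScript());
    }
}
```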
FAILED: org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript Error Message: Cannot load Script Engine for language: JavaScript Stack Trace: org.apache.solr.handler.dataimport.DataImportHandlerException: Cannot load Script Engine for language: JavaScript at org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:76) at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:53) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:192) at org.apache.solr.handler.dataimport.TestScriptTransformer.testCheckScript(TestScriptTransformer.java:122) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556) at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39
RE: I have my name on the staging site
Got it. Also, I updated a few links in the wiki and put an obsolete notice (with a link) on the page about editing Forrest. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] (wana update the instructions too?) -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
I have my name on the staging site
I got my name on the staging site, but the instructions still have TBD for publishing to the real site. Help? James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: Welcome James Dyer
A couple years ago I was told we were going to be adding tons of new docs to Search and we needed to do so low-cost. The only problem: our search vendor licensed their product by the document, making the low-cost goal impossible. I had seen Solr mentioned in the footnote of a book somewhere and thought maybe it was worth looking into. What I didn't realize is that switching to Solr would mean better performance, cheaper hardware, easier configuration, and a lot more flexibility. Better yet, it turned out we wouldn't lose functionality with the switch; I just needed to apply a few patches. But then there were a few little things I couldn't find a patch for, so I subscribed to the dev list and started doing what I could. All I can say is that working on open source is so much better than calling the vendor and having them say, "Nope, it can't do that. But we can file a feature request for you." The work you all do on this project is truly amazing. Thank you for letting me have a little bigger part in it as well. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Grant Ingersoll [mailto:gsing...@apache.org] Sent: Friday, February 10, 2012 7:08 AM To: dev@lucene.apache.org Subject: Welcome James Dyer I'm pleased to announce the PMC has elected James Dyer to be a committer on the project and he has agreed to join. Welcome aboard, James! -Grant - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
SolrCore.java imports org.eclipse.jdt.core.dom.ThisExpression
I'm wondering if the import for org.eclipse.jdt.core.dom.ThisExpression in SolrCore.java introduced in r1196797 (SOLR-2861) was a mistake. It adds an additional .jar dependency and doesn't seem to be used. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
RE: [VOTE] Release Lucene/Solr 3.2.0
Michael, A while ago I submitted SOLR-2462 with a patch to fix a critical bug in Solr's spellchecker. I'm not sure the included patch is the best approach to fixing the problem, but I do think any next release should include a fix for this. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Friday, May 27, 2011 10:50 AM To: dev@lucene.apache.org Dev Subject: [VOTE] Release Lucene/Solr 3.2.0 Please vote to release the artifacts at: http://people.apache.org/~mikemccand/lucene_solr_320/rc1/ as Lucene 3.2.0 and Solr 3.2.0. Mike http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Need some DIH Entity Processor development advice...
We have a situation where we have data coming from several long-running queries hitting multiple relational databases. Other data comes in fixed-width text file feeds, etc. All of this has to be joined and denormalized and made into nice Solr documents. I've been wanting to use DIH, as it seems to already provide 90% of what we need. The rest can come in the form of custom Transformers and Entity Processors that I can write. One big need is to have disk-backed caches. For instance, a child entity that pulls back millions of rows will beat up the db using a regular SqlEntityProcessor, whereas the CachedSqlEntityProcessor puts everything in memory in a HashMap, so it will only scale to a point. For fixed-width text files, there don't seem to be any cached implementations at all. So I've written a custom Entity Processor that creates a temporary Lucene index to use as a disk cache. Initial tests are promising, but with one little problem: I need a place to close the Lucene index reader and then delete the temporary index. It seemed easy enough to override the destroy() method from EntityProcessorBase. But to my surprise, it seems that both destroy() and init() get called every time a new primary key is called up from the cache (see DocBuilder.buildDocument()). Just to be sure I wasn't crazy, I added a destroy() method to CachedSqlEntityProcessor and found it indeed gets called every time a new primary key is called from the cache. In fact, the first couple of lines in cacheInit() in EntityProcessorBase seem to be there to cope with the fact that both destroy() and init() get called over and over again during the lifecycle of the object. I've also noticed that destroy() isn't actually implemented anywhere in the prepackaged Entity Processors. This makes me wonder if it is a mistake. Should DocBuilder be changed to call destroy() only once per lifecycle for each EntityProcessor object? If so, I think I can have a patch in JIRA in short order. 
Otherwise, how do I best accomplish my clean-up tasks? Advice is greatly appreciated. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
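One workaround for the lifecycle problem described above is to make the cleanup idempotent. The class below is an illustrative sketch only, not the real DIH API; the field and method names are invented. It shows the guard that keeps repeated destroy() calls (one per cached primary key, as observed in DocBuilder.buildDocument()) from tearing down the temporary Lucene index more than once:

```java
public class DiskBackedCacheProcessor {
    private boolean cacheOpen = false;
    private boolean cleanedUp = false;

    // May be called repeatedly; only opens the temporary index once.
    public void init() {
        if (!cacheOpen) {
            // open the temporary Lucene index here (omitted)
            cacheOpen = true;
        }
    }

    // Safe even if the caller invokes this once per primary key:
    // the real cleanup (close the IndexReader, delete the temporary
    // index directory) runs at most once.
    public void destroy() {
        if (cleanedUp) {
            return;
        }
        cleanedUp = true;
        cacheOpen = false;
        // close IndexReader and delete the temporary index directory here
    }

    boolean isCleanedUp() {
        return cleanedUp;
    }
}
```

The downside, of course, is that the guard only prevents double cleanup; it cannot know when the *last* destroy() has arrived, which is why a single end-of-lifecycle callback in DocBuilder would be the cleaner fix.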
RE: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality
Grant, I saw your comment and I agree it's probably best to somehow re-query through a Search Handler, either the existing one with all other components turned off, or through a new one just for this purpose. If you (or someone else) are not able to work on implementing it this way, then I can probably get a little time in a few weeks. James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: Grant Ingersoll [mailto:gsing...@apache.org] Sent: Friday, August 13, 2010 7:34 AM To: dev@lucene.apache.org Subject: Re: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality Hi James, Did you see my comments on the issue? On Aug 11, 2010, at 12:28 AM, Dyer, James wrote: Tom, I'm going to also need this to work with 1.4.1 within the next month or two so if someone else doesn't back-port it to 1.4.1 then I probably will. I also would like to see this working with shards. The PossibilityIterator class likely can be made a lot simpler. If nobody else takes care of these items I will try to find time to do so myself prior to making it work with 1.4.1. James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: Tom Phethean (JIRA) [mailto:j...@apache.org] Sent: Tuesday, August 10, 2010 10:01 AM To: dev@lucene.apache.org Subject: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality [ https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896903#action_12896903 ] Tom Phethean commented on SOLR-2010: Ok, thanks. Do you know if there is a rough timescale on that? 
Improvements to SpellCheckComponent Collate functionality - Key: SOLR-2010 URL: https://issues.apache.org/jira/browse/SOLR-2010 Project: Solr Issue Type: New Feature Components: clients - java, spellchecker Affects Versions: 1.4.1 Environment: Tested against trunk revision 966633 Reporter: James Dyer Assignee: Grant Ingersoll Priority: Minor Attachments: SOLR-2010.patch, SOLR-2010.patch Improvements to SpellCheckComponent Collate functionality Our project requires a better Spell Check Collator. I'm contributing this as a patch to get suggestions for improvements and in case there is a broader need for these features. 1. Only return collations that are guaranteed to result in hits if re-queried (applying original fq params also). This is especially helpful when there is more than one correction per query. The 1.4 behavior does not verify that a particular combination will actually return hits. 2. Provide the option to get multiple collation suggestions 3. Provide extended collation results including the # of hits re-querying will return and a breakdown of each misspelled word and its correction. This patch is similar to what is described in SOLR-507 item #1. Also, this patch provides a viable workaround for the problem discussed in SOLR-1074. A dictionary could be created that combines the terms from the multiple fields. The collator then would prune out any spurious suggestions this would cause. This patch adds the following spellcheck parameters: 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try before giving up. Lower values ensure better performance. Higher values may be necessary to find a collation that can return results. Default is 0, which maintains backwards-compatible behavior (do not check collations). 2. spellcheck.maxCollations - maximum # of collations to return. Default is 1, which maintains backwards-compatible behavior. 3. 
spellcheck.collateExtendedResult - if true, returns an expanded response format detailing collations found. Default is false, which maintains backwards-compatible behavior. When true, output is like this (in context):

<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="hopq">
      <int name="numFound">94</int>
      <int name="startOffset">7</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>hope</str>
        <str>how</str>
        <str>hope</str>
        <str>chops</str>
        <str>hoped</str>
        etc
      </arr>
    </lst>
    <lst name="faill">
      <int name="numFound">100</int>
      <int name="startOffset">16</int>
      <int name="endOffset">21</int>
      <arr name="suggestion">
        <str>fall</str>
        <str>fails</str>
        <str>fail</str>
RE: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality
Tom, I'm going to also need this to work with 1.4.1 within the next month or two so if someone else doesn't back-port it to 1.4.1 then I probably will. I also would like to see this working with shards. The PossibilityIterator class likely can be made a lot simpler. If nobody else takes care of these items I will try to find time to do so myself prior to making it work with 1.4.1. James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: Tom Phethean (JIRA) [mailto:j...@apache.org] Sent: Tuesday, August 10, 2010 10:01 AM To: dev@lucene.apache.org Subject: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality [ https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896903#action_12896903 ] Tom Phethean commented on SOLR-2010: Ok, thanks. Do you know if there is a rough timescale on that? Improvements to SpellCheckComponent Collate functionality - Key: SOLR-2010 URL: https://issues.apache.org/jira/browse/SOLR-2010 Project: Solr Issue Type: New Feature Components: clients - java, spellchecker Affects Versions: 1.4.1 Environment: Tested against trunk revision 966633 Reporter: James Dyer Assignee: Grant Ingersoll Priority: Minor Attachments: SOLR-2010.patch, SOLR-2010.patch Improvements to SpellCheckComponent Collate functionality Our project requires a better Spell Check Collator. I'm contributing this as a patch to get suggestions for improvements and in case there is a broader need for these features. 1. Only return collations that are guaranteed to result in hits if re-queried (applying original fq params also). This is especially helpful when there is more than one correction per query. The 1.4 behavior does not verify that a particular combination will actually return hits. 2. Provide the option to get multiple collation suggestions 3. 
Provide extended collation results including the # of hits re-querying will return and a breakdown of each misspelled word and its correction. This patch is similar to what is described in SOLR-507 item #1. Also, this patch provides a viable workaround for the problem discussed in SOLR-1074. A dictionary could be created that combines the terms from the multiple fields. The collator then would prune out any spurious suggestions this would cause. This patch adds the following spellcheck parameters: 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try before giving up. Lower values ensure better performance. Higher values may be necessary to find a collation that can return results. Default is 0, which maintains backwards-compatible behavior (do not check collations). 2. spellcheck.maxCollations - maximum # of collations to return. Default is 1, which maintains backwards-compatible behavior. 3. spellcheck.collateExtendedResult - if true, returns an expanded response format detailing collations found. Default is false, which maintains backwards-compatible behavior. When true, output is like this (in context):

<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="hopq">
      <int name="numFound">94</int>
      <int name="startOffset">7</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>hope</str>
        <str>how</str>
        <str>hope</str>
        <str>chops</str>
        <str>hoped</str>
        etc
      </arr>
    </lst>
    <lst name="faill">
      <int name="numFound">100</int>
      <int name="startOffset">16</int>
      <int name="endOffset">21</int>
      <arr name="suggestion">
        <str>fall</str>
        <str>fails</str>
        <str>fail</str>
        <str>fill</str>
        <str>faith</str>
        <str>all</str>
        etc
      </arr>
    </lst>
    <lst name="collation">
      <str name="collationQuery">Title:(how AND fails)</str>
      <int name="hits">2</int>
      <lst name="misspellingsAndCorrections">
        <str name="hopq">how</str>
        <str name="faill">fails</str>
      </lst>
    </lst>
    <lst name="collation">
      <str name="collationQuery">Title:(hope AND faith)</str>
      <int name="hits">2</int>
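The three parameters from the patch description can be assembled into a request as in the sketch below. This is a hedged illustration: only the spellcheck.* parameter names come from the patch description above, while the query text and the chosen values are made up for the example.

```java
public class SpellcheckParams {
    // Assemble the SOLR-2010 spellcheck parameters into a query string.
    // The q value is hypothetical; parameter names are from the patch text.
    static String build() {
        return String.join("&",
            "q=Title:(hopq AND faill)",
            "spellcheck=true",
            "spellcheck.collate=true",
            "spellcheck.maxCollationTries=10",       // try up to 10 candidate collations
            "spellcheck.maxCollations=5",            // return up to 5 collations
            "spellcheck.collateExtendedResult=true"); // expanded response format
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```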