[jira] [Commented] (SOLR-6907) URLEncode documents directory in MorphlineMapperTest

2015-01-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263599#comment-14263599 ] wolfgang hoschek commented on SOLR-6907: +1 Looks reasonable to me. URLEncode

[jira] [Commented] (SOLR-4509) Disable HttpClient stale check for performance and fewer spurious connection errors.

2014-11-25 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224815#comment-14224815 ] wolfgang hoschek commented on SOLR-4509: Would be good to remove that stale check

[jira] [Commented] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4

2014-06-29 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047223#comment-14047223 ] wolfgang hoschek commented on SOLR-6212: This is already fixed in the latest stable

[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2014-06-29 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047391#comment-14047391 ] wolfgang hoschek commented on SOLR-5109: FWIW, morphlines currently won't work

[jira] [Comment Edited] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2014-06-29 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047394#comment-14047394 ] wolfgang hoschek edited comment on SOLR-5109 at 6/30/14 5:36 AM

[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2014-06-29 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047394#comment-14047394 ] wolfgang hoschek commented on SOLR-5109: Another potential issue is that hadoop

Re: Adding Morphline support to DIH - worth the effort?

2014-06-11 Thread Wolfgang Hoschek
From our perspective we don’t really see use cases for DIH anymore. Morphlines was developed primarily with Lucene in mind (even though it doesn’t require Lucene). Flume Morphline Solr Sink handles streaming ingestion into Solr in reliable, scalable, flexible and loosely coupled ways, in

[jira] [Commented] (SOLR-6126) MapReduce's GoLive script should support replicas

2014-06-02 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-6126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015266#comment-14015266 ] wolfgang hoschek commented on SOLR-6126: [~dsmiley] It uses the --zk-host CLI

[jira] [Commented] (SOLR-6126) MapReduce's GoLive script should support replicas

2014-06-01 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-6126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015092#comment-14015092 ] wolfgang hoschek commented on SOLR-6126: The comment in the code is a bit outdated

[jira] [Commented] (SOLR-5848) Morphlines is not resolving

2014-03-12 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932328#comment-13932328 ] wolfgang hoschek commented on SOLR-5848: Going forward I'd recommend upgrading

[jira] [Commented] (SOLR-5848) Morphlines is not resolving

2014-03-12 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932378#comment-13932378 ] wolfgang hoschek commented on SOLR-5848: Sounds good. Thx! Morphlines

[jira] [Created] (SOLR-5786) MapReduceIndexerTool --help text is missing large parts of the help text

2014-02-27 Thread wolfgang hoschek (JIRA)
wolfgang hoschek created SOLR-5786: -- Summary: MapReduceIndexerTool --help text is missing large parts of the help text Key: SOLR-5786 URL: https://issues.apache.org/jira/browse/SOLR-5786 Project

[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-02-27 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914549#comment-13914549 ] wolfgang hoschek commented on SOLR-5605: Correspondingly, I filed https

[jira] [Updated] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text

2014-02-27 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wolfgang hoschek updated SOLR-5786: --- Summary: MapReduceIndexerTool --help output is missing large parts of the help text

[jira] [Updated] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text

2014-02-27 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wolfgang hoschek updated SOLR-5786: --- Description: As already mentioned repeatedly and at length, this is a regression introduced

[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-02-27 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915037#comment-13915037 ] wolfgang hoschek commented on SOLR-5605: bq. Are you not a committer? At Apache

[jira] [Comment Edited] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-02-27 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915037#comment-13915037 ] wolfgang hoschek edited comment on SOLR-5605 at 2/27/14 9:23 PM

[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-02-25 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911744#comment-13911744 ] wolfgang hoschek commented on SOLR-5605: I have looked, have you? I have fixed

[jira] [Reopened] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-02-19 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wolfgang hoschek reopened SOLR-5605: Without this the --help text is screwed. https://issues.apache.org/jira/secure/EditComment

[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-02-19 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905806#comment-13905806 ] wolfgang hoschek commented on SOLR-5605: Yes, as already mentioned, otherwise some

Re: Welcome Benson Margulies as Lucene/Solr committer!

2014-01-28 Thread Wolfgang Hoschek
Welcome on board! Wolfgang. On Jan 26, 2014, at 4:32 PM, Erick Erickson wrote: Good to have you aboard! Erick On Sat, Jan 25, 2014 at 10:52 PM, Mark Miller markrmil...@gmail.com wrote: Welcome! - Mark http://about.me/markrmiller On Jan 25, 2014, at 4:40 PM, Michael McCandless

[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-01-04 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862272#comment-13862272 ] wolfgang hoschek commented on SOLR-5605: Thanks for getting to the bottom

[jira] [Comment Edited] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest

2014-01-04 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862272#comment-13862272 ] wolfgang hoschek edited comment on SOLR-5605 at 1/4/14 11:42 AM

[jira] [Commented] (SOLR-5584) Update to Guava 15.0

2014-01-04 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862273#comment-13862273 ] wolfgang hoschek commented on SOLR-5584: As mentioned above, morphlines

Re: The Old Git Discussion

2014-01-02 Thread Wolfgang Hoschek
+1 On Jan 2, 2014, at 10:53 PM, Simon Willnauer wrote: +1 On Thu, Jan 2, 2014 at 9:51 PM, Mark Miller markrmil...@gmail.com wrote: bzr is dying; Emacs needs to move http://lists.gnu.org/archive/html/emacs-devel/2014-01/msg5.html Interesting thread. For similar reasons, I

[jira] [Commented] (SOLR-5584) Update to Guava 15.0

2013-12-30 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858699#comment-13858699 ] wolfgang hoschek commented on SOLR-5584: What exactly is failing for you

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-25 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856657#comment-13856657 ] wolfgang hoschek commented on SOLR-1301: Also see https://issues.cloudera.org

[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-15 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848097#comment-13848097 ] wolfgang hoschek edited comment on SOLR-1301 at 12/16/13 2:27 AM

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-15 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848775#comment-13848775 ] wolfgang hoschek commented on SOLR-1301: bq. it would be convenient if we could

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-13 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848097#comment-13848097 ] wolfgang hoschek commented on SOLR-1301: Might be best to write a program

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843443#comment-13843443 ] wolfgang hoschek commented on SOLR-1301: I'm not aware of anything needing jersey

[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843443#comment-13843443 ] wolfgang hoschek edited comment on SOLR-1301 at 12/9/13 7:30 PM

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843523#comment-13843523 ] wolfgang hoschek commented on SOLR-1301: Apologies for the confusion. We

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-06 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842034#comment-13842034 ] wolfgang hoschek commented on SOLR-1301: There are also some important fixes

[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-06 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842034#comment-13842034 ] wolfgang hoschek edited comment on SOLR-1301 at 12/7/13 2:57 AM

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-04 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839308#comment-13839308 ] wolfgang hoschek commented on SOLR-1301: There are also some fixes downstream

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-04 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839311#comment-13839311 ] wolfgang hoschek commented on SOLR-1301: Minor nit: could remove

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-04 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839556#comment-13839556 ] wolfgang hoschek commented on SOLR-1301: FWIW, a current printout of --help showing

[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-04 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839556#comment-13839556 ] wolfgang hoschek edited comment on SOLR-1301 at 12/5/13 12:55 AM

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing!

2013-12-03 Thread Wolfgang Hoschek
On Dec 3, 2013, at 12:11 AM, Uwe Schindler wrote: Looks like Java's service loader lookup impl has become more strict in Java8. This issue on Java 8 is kind of unfortunate because morphlines and solr-mr don't actually use JAXP at all. For the time being might be best to disable testing

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing!

2013-12-03 Thread Wolfgang Hoschek
FYI, I filed this saxon ticket: https://saxonica.plan.io/issues/1944 On Dec 3, 2013, at 12:52 AM, Wolfgang Hoschek wrote: On Dec 3, 2013, at 12:11 AM, Uwe Schindler wrote: Looks like Java's service loader lookup impl has become more strict in Java8. This issue on Java 8 is kind

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing!

2013-12-03 Thread Wolfgang Hoschek
opinion on the subject in that stack overflow topic... :) Dawid On Tue, Dec 3, 2013 at 9:52 AM, Wolfgang Hoschek whosc...@cloudera.com wrote: On Dec 3, 2013, at 12:11 AM, Uwe Schindler wrote: Looks like Java's service loader lookup impl has become more strict in Java8. This issue

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837976#comment-13837976 ] wolfgang hoschek commented on SOLR-1301: bq. module/dir names I propose morphlines

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837979#comment-13837979 ] wolfgang hoschek commented on SOLR-1301: +1 to map-reduce-indexer module name/dir

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing!

2013-12-03 Thread Wolfgang Hoschek
Subject: Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing! Ha! Thanks for filing the issue, Wolfgang. D. On Tue, Dec 3, 2013 at 12:01 PM, Wolfgang Hoschek whosc...@cloudera.com wrote: Actually, Mike's opinion has changed because now Saxon

[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837976#comment-13837976 ] wolfgang hoschek edited comment on SOLR-1301 at 12/3/13 6:40 PM

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838054#comment-13838054 ] wolfgang hoschek commented on SOLR-1301: bq. The problem with these two names

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838064#comment-13838064 ] wolfgang hoschek commented on SOLR-1301: +1 on Steve's suggestion as well. Thanks

[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838305#comment-13838305 ] wolfgang hoschek edited comment on SOLR-1301 at 12/3/13 11:11 PM

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-03 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838305#comment-13838305 ] wolfgang hoschek commented on SOLR-1301: Upon a bit more reflection might be better

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-02 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837068#comment-13837068 ] wolfgang hoschek commented on SOLR-1301: There is also a known issue

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b117) - Build # 8549 - Still Failing!

2013-12-02 Thread Wolfgang Hoschek
Looks like Java's service loader lookup impl has become more strict in Java8. This issue on Java 8 is kind of unfortunate because morphlines and solr-mr don't actually use JAXP at all. For the time being might be best to disable testing on Java8 for this contrib, in order to get a stable

Re: Welcome Joel Bernstein

2013-10-04 Thread Wolfgang Hoschek
Welcome Joel! Wolfgang. On Oct 3, 2013, at 9:56 AM, Erick Erickson wrote: Welcome Joel! On Thu, Oct 3, 2013 at 9:33 AM, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: Welcome Joel! On 3 October 2013 15:45, Shawn Heisey s...@elyograg.org wrote: On 10/2/2013 11:24 PM,

Re: Welcome back, Wolfgang Hoschek!

2013-09-26 Thread Wolfgang Hoschek
Thanks to all! Looking forward to more contributions. Wolfgang. On Sep 26, 2013, at 3:21 AM, Uwe Schindler wrote: Hi, I'm pleased to announce that after a long abstinence, Wolfgang Hoschek rejoined the Lucene/Solr committer team. He is working now at Cloudera and plans to help

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-16 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768629#comment-13768629 ] wolfgang hoschek commented on SOLR-1301: cdk-morphlines-solr-core and cdk

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-16 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768662#comment-13768662 ] wolfgang hoschek commented on SOLR-1301: Seems like the patch still misses tika-xmp

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-10 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763618#comment-13763618 ] wolfgang hoschek commented on SOLR-1301: FYI, one thing that's definitely off

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-10 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763636#comment-13763636 ] wolfgang hoschek commented on SOLR-1301: By the way, docs and the downstream code

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-10 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763644#comment-13763644 ] wolfgang hoschek commented on SOLR-1301: This new solr-mr contrib uses morphlines

[jira] [Commented] (LUCENE-4661) Reduce default maxMerge/ThreadCount for ConcurrentMergeScheduler

2013-01-08 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547367#comment-13547367 ] wolfgang hoschek commented on LUCENE-4661: -- Might be good to experiment

Re: [jira] Field constructor, avoiding String.intern()

2007-02-23 Thread Wolfgang Hoschek
, the performance gain of using equals() on interned strings is no match for the performance loss of interning the field name of each field. Wolfgang Hoschek-2 wrote: I noticed that, too, but in my case the difference was often much more extreme: it was one of the primary bottlenecks

Re: [jira] Commented: (LUCENE-794) Beginnings of a span based highlighter

2007-02-05 Thread Wolfgang Hoschek
I need to read the TokenStream at least twice I used the horribly hacky but quick-for-me method of adding a method to MemoryIndex that accepts a List of Tokens. Any ideas? I'm not sure about modifying MemoryIndex. It should be easy enough to create a subclass of TokenStream -

[jira] Commented: (LUCENE-129) Finalizers are non-canonical

2007-01-05 Thread wolfgang hoschek (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462579 ] wolfgang hoschek commented on LUCENE-129: - Just to clarify: The empty finalize() method body in MemoryIndex

[jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory consuming index

2006-11-21 Thread wolfgang hoschek (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12451817 ] wolfgang hoschek commented on LUCENE-550: - All Lucene unit tests have been adapted to work with my alternate index. Everything but proximity queries

[jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory consuming index

2006-11-21 Thread wolfgang hoschek (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12451768 ] wolfgang hoschek commented on LUCENE-550: - Ok. That means a basic test passes. For some more exhaustive tests, run all the queries in src/test/org

[jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory consuming index

2006-11-21 Thread wolfgang hoschek (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12451731 ] wolfgang hoschek commented on LUCENE-550: - Other question: when running the driver in test mode (checking for equality of query results against

[jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory consuming index

2006-11-21 Thread wolfgang hoschek (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12451730 ] wolfgang hoschek commented on LUCENE-550: - What's the benchmark configuration? For example, is throughput bounded by indexing or querying? Measuring N

Re: MemoryIndex

2006-05-02 Thread Wolfgang Hoschek
MemoryIndex was designed to maximize performance for a specific use case: pure in-memory datastructure, at most one document per MemoryIndex instance, any number of fields, high frequency reads, high frequency index writes, no thread-safety required, optional support for storing offsets.

Re: Optimizing/minimizing memory usage of memory-based indexes

2006-02-11 Thread Wolfgang Hoschek
Initially it might, but probably eventually not. I was thinking Lucene formats might also be a bit more compact than vanilla hash maps, but I guess that depends on many factors. But I will probably want to play with actual queries later on, based on frequencies. OK. In the latter case, are

Re: Advanced query language

2005-12-18 Thread Wolfgang Hoschek
On Dec 17, 2005, at 2:36 PM, Paul Elschot wrote: Gentlemen, While maintaining my bookmarks I ran into this: Case Study: Enabling Low-Cost XML-Aware Searching Capable of Complex Querying: http://www.idealliance.org/papers/xmle02/dx_xmle02/papers/03-02-08/03-02-08.html Some loose thoughts:

Re: Advanced query language

2005-12-17 Thread Wolfgang Hoschek
done with extension to this code. Regards, Paul Elschot On Friday 16 December 2005 03:45, Wolfgang Hoschek wrote: I think implementing an XQuery Full-Text engine is far beyond the scope of Lucene. Implementing a building block for the fulltext aspect of it would be more manageable

Re: Advanced query language

2005-12-15 Thread Wolfgang Hoschek
I think implementing an XQuery Full-Text engine is far beyond the scope of Lucene. Implementing a building block for the fulltext aspect of it would be more manageable. Unfortunately the W3C fulltext drafts indiscriminately mix and mingle two completely different languages into a single

Re: Advanced query language

2005-12-15 Thread Wolfgang Hoschek
in Java 6, but that doesn't help too much given the Java 1.4 req. -Yonik On 12/15/05, Wolfgang Hoschek [EMAIL PROTECTED] wrote: STAX would probably make coding easier, but unfortunately complicates the packaging side: one must ship at least two additional external jars (stax interfaces and impl

Re: Advanced query language

2005-12-06 Thread Wolfgang Hoschek
That's basically what I'm implementing with Nux, except that the syntax and calling conventions are a bit different, and that Lucene analyzers can optionally be specified, which makes it a lot more powerful (but also a bit more complicated). Wolfgang. On Dec 6, 2005, at 10:48 AM, Incze

Re: Advanced query language

2005-12-05 Thread Wolfgang Hoschek
Hopefully that makes sense to someone besides just me. It's certainly a lot more complexity than a simple one to one mapping, but it seems to me like the flexibility is worth spending the extra time to design/ build it. Makes perfect sense to me, and it doesn't seem any more complex

Re: open source YourKit licence

2005-12-02 Thread Wolfgang Hoschek
Yonik, I haven't been terribly active lately, but I've been voted in as committer as well... :-) http://marc.theaimsgroup.com/?l=lucene-dev&w=2&r=1&s=hoschek+committer&q=b Cheers, Wolfgang. On Dec 2, 2005, at 2:53 PM, Yonik Seeley wrote: ~yonik/yourkit/

Re: Lucene does NOT use UTF-8.

2005-08-31 Thread Wolfgang Hoschek
On Aug 30, 2005, at 12:47 PM, Doug Cutting wrote: Yonik Seeley wrote: I've been looking around... do you have a pointer to the source where just the suffix is converted from UTF-8? I understand the index format, but I'm not sure I understand the problem that would be posed by the prefix

[ANN] Nux-1.3 released

2005-08-03 Thread Wolfgang Hoschek
The Nux-1.3 release has been uploaded to http://dsd.lbl.gov/nux/ Nux is an open-source Java toolkit making efficient and powerful XML processing easy. Changelog: • Upgraded to saxonb-8.5 (saxon-8.4 and 8.3 should continue to work as well). • Upgraded to xom-1.1-rc1

Re: Analyzer as an Interface?

2005-07-19 Thread Wolfgang Hoschek
On Jul 19, 2005, at 12:58 PM, Daniel Naber wrote: Hi, currently Analyzer is an abstract class. Shouldn't we make it an Interface? Currently that's not possible, but it will be as soon as the deprecated method is removed (i.e. after Lucene 1.9). Regards Daniel Daniel, what's the use

Re: Lucene vs. Ruby/Odeum

2005-06-02 Thread Wolfgang Hoschek
poor java startup time For the ones really keen on reducing startup time the Jolt Java VM daemon may perhaps be of some interest: http://www.dystance.net/software/jolt/index.html I played with it a year ago when I was curious to see what could be done about startup time in the context of

Re: Lucene vs. Ruby/Odeum

2005-06-01 Thread Wolfgang Hoschek
As an aside, in my performance testing of Lucene using JProfiler, it seems to me that the only way to improve Lucene's performance greatly can come from 2 areas 1. optimizing the JVM array/looping/JIT constructs/capabilities to avoid bounds checking/improve performance 2. improve function

Re: contrib/queryParsers/surround

2005-05-28 Thread Wolfgang Hoschek
Cool stuff. Once this has stabilized and settled down I might start exposing the surround language from XQuery/XPath as an experimental match facility. Wolfgang. On May 28, 2005, at 10:07 AM, Paul Elschot wrote: On Saturday 28 May 2005 17:06, Erik Hatcher wrote: On May 28, 2005,

[ANN] nux-1.2 release

2005-05-25 Thread Wolfgang Hoschek
The nux-1.2 release has been uploaded to http://dsd.lbl.gov/nux/ Nux is an open-source Java XML toolset geared towards embedded use in high-throughput XML messaging middleware such as large-scale Peer-to-Peer infrastructures, message queues, publish-subscribe and matchmaking systems

Add Term.createTerm to avoid 99% of String.intern() calls

2005-05-18 Thread Wolfgang Hoschek
For the MemoryIndex, I'm seeing large performance overheads due to repetitive temporary string interning of o.a.l.index.Term. For example, consider a FuzzyTermQuery or similar, scanning all terms via TermEnum in the index: 40% of the time is spent in String.intern() of new Term().
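The idea named in the subject line can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: `SketchTerm` is a hypothetical stand-in for `o.a.l.index.Term`, and it assumes that the public constructor interns the field name, which is the cost the message describes. A factory method on an existing instance can pass the already-interned field name to the new instance, so scanning many terms of one field interns only once.

```java
// Hypothetical stand-in for a term class; not Lucene's actual Term.
final class SketchTerm {
    private final String field; // invariant: always an interned string
    private final String text;

    // Regular constructor: pays for one String.intern() call per instance.
    SketchTerm(String field, String text) {
        this(field.intern(), text, true);
    }

    // Private constructor: trusts that the field name is already interned.
    private SketchTerm(String internedField, String text, boolean trusted) {
        this.field = internedField;
        this.text = text;
    }

    // Creates a term in the same field without interning again.
    SketchTerm createTerm(String newText) {
        return new SketchTerm(field, newText, true);
    }

    String field() { return field; }
    String text() { return text; }
}
```

Because the field name stays interned, callers can still compare field names by reference, while only the first term of a scan pays for the intern call.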

Re: Lucene vs. Ruby/Odeum

2005-05-17 Thread Wolfgang Hoschek
Right. One doesn't need to run those benchmarks to immediately see that most time is spent in VM startup, class loading, hotspot compilation rather than anything Lucene related. Even a simple System.out.println("hello") typically takes some 0.3 secs on a fast box and JVM. Wolfgang. On May

Re: [Performance] Streaming main memory indexing of single strings

2005-05-03 Thread Wolfgang Hoschek
Here's a performance patch for MemoryIndex.MemoryIndexReader that caches the norms for a given field, avoiding repeated recomputation of the norms. Recall that, depending on the query, norms() can be called over and over again with mostly the same parameters. Thus, replace public byte[]
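The caching idea described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: `NormsCache` and its `compute` callback are hypothetical names, and the sketch remembers only the most recently requested field, on the assumption (taken from the message) that repeated calls mostly use the same parameters.

```java
import java.util.function.Function;

// Hypothetical helper; not part of MemoryIndex or Lucene.
final class NormsCache {
    private final Function<String, byte[]> compute; // the expensive computation
    private String cachedField;  // field name of the last computed norms
    private byte[] cachedNorms;  // result belonging to cachedField
    private int computations;    // how often compute actually ran

    NormsCache(Function<String, byte[]> compute) {
        this.compute = compute;
    }

    // Returns the cached array when the field matches the previous call,
    // and recomputes only when a different field is requested.
    byte[] norms(String fieldName) {
        if (cachedNorms == null || !fieldName.equals(cachedField)) {
            cachedNorms = compute.apply(fieldName);
            cachedField = fieldName;
            computations++;
        }
        return cachedNorms;
    }

    int computations() { return computations; }
}
```

A single-entry cache keeps the bookkeeping minimal; a map keyed by field name would be the obvious alternative if queries alternate between several fields.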

contrib: keywordTokenStream

2005-05-03 Thread Wolfgang Hoschek
Here's a convenience add-on method to MemoryIndex. If it turns out that this could be of wider use, it could be moved into the core analysis package. For the moment the MemoryIndex might be a better home. Opinions, anyone? Wolfgang. /** * Convenience method; Creates and returns a token

Re: contrib: keywordTokenStream

2005-05-03 Thread Wolfgang Hoschek
On May 3, 2005, at 5:26 PM, Erik Hatcher wrote: Wolfgang, I've now added this. Thanks :-) I'm not seeing how this could be generally useful. I'm curious how you are using it and why it is better suited for what you're doing than any other analyzer. keyword tokenizer is a bit overloaded

Re: [Performance] Streaming main memory indexing of single strings

2005-05-02 Thread Wolfgang Hoschek
I'm looking at it right now. The tests pass fine when you put lucene-1.4.3.jar instead of the current lucene onto the classpath which is what I've been doing so far. Something seems to have changed in the scoring calculation. No idea what that might be. I'll see if I can find out.

Re: [Performance] Streaming main memory indexing of single strings

2005-05-02 Thread Wolfgang Hoschek
calculation now be done differently? If so, how? Thanks for any clues into the right direction. Wolfgang. On May 2, 2005, at 9:05 AM, Wolfgang Hoschek wrote: I'm looking at it right now. The tests pass fine when you put lucene-1.4.3.jar instead of the current lucene onto the classpath which

Re: [Performance] Streaming main memory indexing of single strings

2005-05-02 Thread Wolfgang Hoschek
Yes, the svn trunk uses skipTo more often than 1.4.3. However, your implementation of skipTo() needs some improvement. See the javadoc of skipTo of class Scorer: http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Scorer.html#skipTo(int) What's wrong with the version I sent? Remember

Re: [Performance] Streaming main memory indexing of single strings

2005-05-02 Thread Wolfgang Hoschek
The version I sent returns in O(1), if performance was your concern. Or did you mean something else? Since 0 is the only document number in the index, a return target == 0; might be nice for skipTo(). It doesn't really help performance, though, and the next() works just as well. Regards, Paul

Re: [Performance] Streaming main memory indexing of single strings

2005-05-02 Thread Wolfgang Hoschek
Thanks! Wolfgang. I've committed this change after it successfully worked for me. Thanks! Erik

[Patch] IndexReader.finalize() performance

2005-04-28 Thread Wolfgang Hoschek
Here is the first and most high-priority patch I've settled on to get Lucene to work efficiently for the typical usage scenarios of MemoryIndex. More patches are forthcoming if this one is received favourably... There's large overhead involved in forcing all IndexReader impls to have a

Re: [Performance] Streaming main memory indexing of single strings

2005-04-27 Thread Wolfgang Hoschek
Whichever place you settle on is fine with me. [In case it might make a difference: Just note that MemoryIndex has a small auxiliary dependency on PatternAnalyzer in addField() because the Analyzer superclass doesn't have a tokenStream(String fieldName, String text) method. And PatternAnalyzer

Re: [Performance] Streaming main memory indexing of single strings

2005-04-27 Thread Wolfgang Hoschek
OK. I'll send an update as soon as I get round to it... Wolfgang. On Apr 27, 2005, at 12:22 PM, Doug Cutting wrote: Erik Hatcher wrote: I'm not quite sure where to put MemoryIndex - maybe it deserves to stand on its own in a new contrib area? That sounds good to me. Ok... once Wolfgang gives me

Re: [Performance] Streaming main memory indexing of single strings

2005-04-26 Thread Wolfgang Hoschek
is running round in the woods), * English)); * /pre On Apr 22, 2005, at 1:53 PM, Wolfgang Hoschek wrote: I've now got the contrib code cleaned up, tested and documented into a decent state, ready for your review and comments. Consider this a formal contrib (Apache license is attached

Re: [Performance] Streaming main memory indexing of single strings

2005-04-22 Thread Wolfgang Hoschek
. Cheers, Wolfgang. On Apr 20, 2005, at 11:26 AM, Wolfgang Hoschek wrote: On Apr 20, 2005, at 9:22 AM, Erik Hatcher wrote: On Apr 20, 2005, at 12:11 PM, Wolfgang Hoschek wrote: By the way, by now I have a version against 1.4.3 that is 10-100 times faster (i.e. 3 - 20 index+query steps/sec

Re: [Performance] Streaming main memory indexing of single strings

2005-04-20 Thread Wolfgang Hoschek
in the first place!) Luc -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Saturday, April 16, 2005 2:09 AM To: java-dev@lucene.apache.org Subject: Re: [Performance] Streaming main memory indexing of single strings On Apr 15, 2005, at 6:15 PM, Wolfgang Hoschek wrote: Cool
