[jira] [Commented] (SOLR-1632) Distributed IDF

2013-12-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843025#comment-13843025
 ] 

Markus Jelsma commented on SOLR-1632:
-

It is much faster now, even usable. But I haven't tried it in a larger cluster 
yet.

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-3191) field exclusion from fl

2013-12-09 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-3191:
---

Assignee: (was: Shalin Shekhar Mangar)

I don't have time right now to review this. I assigned it to myself because 
there was a lot of public interest but no assignee. However, it looks like a 
couple of other committers are interested in this issue as well. I can only look 
at this after a few weeks, so if no one takes it up by then, I will.

 field exclusion from fl
 ---

 Key: SOLR-3191
 URL: https://issues.apache.org/jira/browse/SOLR-3191
 Project: Solr
  Issue Type: Improvement
Reporter: Luca Cavanna
Priority: Minor
 Attachments: SOLR-3191.patch, SOLR-3191.patch


 I think it would be useful to add a way to exclude fields from the Solr 
 response. If I have, for example, 100 stored fields and I want to return all of 
 them but one, it would be handy to list just the one field I want to exclude 
 instead of the 99 fields for inclusion through fl.
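
 A short SolrJ sketch of the difference (the exclusion syntax shown here is 
 purely hypothetical; the attached patches may use something different):

   SolrQuery q = new SolrQuery("id:123");
   // today: every wanted field has to be enumerated explicitly
   q.setFields("title", "author", "price" /* ... 96 more ... */);
   // proposed idea: ask for everything except one field (made-up syntax)
   // q.setFields("*,-large_binary_field");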



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()

2013-12-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843070#comment-13843070
 ] 

ASF subversion and git services commented on SOLR-5525:
---

Commit 1549552 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1549552 ]

SOLR-5525

 deprecate ClusterState#getCollectionStates() 
 -

 Key: SOLR-5525
 URL: https://issues.apache.org/jira/browse/SOLR-5525
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5525.patch, SOLR-5525.patch


 This is a very expensive call if there is a large number of collections. 
 Mostly, it is used to check whether a collection exists.
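
 A minimal sketch of the cheaper existence check the description alludes to 
 (assuming ClusterState#hasCollection is available in your version; zkStateReader 
 is an existing ZkStateReader, and the collection name is made up):

   ClusterState clusterState = zkStateReader.getClusterState();
   // looks up a single entry instead of materializing every collection's state
   boolean exists = clusterState.hasCollection("mycollection");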



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()

2013-12-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843074#comment-13843074
 ] 

ASF subversion and git services commented on SOLR-5525:
---

Commit 1549554 from [~noble.paul] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1549554 ]

SOLR-5525

 deprecate ClusterState#getCollectionStates() 
 -

 Key: SOLR-5525
 URL: https://issues.apache.org/jira/browse/SOLR-5525
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5525.patch, SOLR-5525.patch


 This is a very expensive call if there is a large number of collections. 
 Mostly, it is used to check whether a collection exists.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4386) Variable expansion doesn't work in DIH SimplePropertiesWriter's filename

2013-12-09 Thread Ryuzo Yamamoto (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843099#comment-13843099
 ] 

Ryuzo Yamamoto commented on SOLR-4386:
--

Hi!
Do you have a plan to fix this?
I would also like to use variable expansion in SimplePropertiesWriter's filename.

 Variable expansion doesn't work in DIH SimplePropertiesWriter's filename
 

 Key: SOLR-4386
 URL: https://issues.apache.org/jira/browse/SOLR-4386
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Jonas Birgander
Assignee: Shalin Shekhar Mangar
  Labels: dataimport
 Attachments: SOLR-4386.patch


 I'm testing Solr 4.1, but I've run into some problems with 
 DataImportHandler's new propertyWriter tag.
 I'm trying to use variable expansion in the `filename` field when using 
 SimplePropertiesWriter.
 Here are the relevant parts of my configuration:
 conf/solrconfig.xml
 -
 <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
   <lst name="defaults">
     <str name="config">db-data-config.xml</str>
   </lst>
   <lst name="invariants">
     <!-- country_code is available -->
     <str name="country_code">${country_code}</str>
     <!-- In the real config, more variables are set here -->
   </lst>
 </requestHandler>
 conf/db-data-config.xml
 -
 <dataConfig>
   <propertyWriter dateFormat="yyyy-MM-dd HH:mm:ss"
                   type="SimplePropertiesWriter"
                   directory="conf"
                   filename="${dataimporter.request.country_code}.dataimport.properties" />
   <dataSource type="JdbcDataSource"
               driver="${dataimporter.request.db_driver}"
               url="${dataimporter.request.db_url}"
               user="${dataimporter.request.db_user}"
               password="${dataimporter.request.db_password}"
               batchSize="${dataimporter.request.db_batch_size}" />
   <document>
     <entity name="item"
             query="my normal SQL, not really relevant
                    -- country=${dataimporter.request.country_code}">
       <field column="id"/>
       <!-- ...more field tags... -->
       <field column="$deleteDocById"/>
       <field column="$skipDoc"/>
     </entity>
   </document>
 </dataConfig>
 If country_code is set to gb, I want the last_index_time to be read and 
 written in the file conf/gb.dataimport.properties, instead of the default 
 conf/dataimport.properties
 The variable expansion works perfectly in the SQL and setup of the data 
 source, but not in the property writer's filename field.
 When initiating an import, the log file shows:
 Jan 30, 2013 11:25:42 AM org.apache.solr.handler.dataimport.DataImporter 
 maybeReloadConfiguration
 INFO: Loading DIH Configuration: db-data-config.xml
 Jan 30, 2013 11:25:42 AM 
 org.apache.solr.handler.dataimport.config.ConfigParseUtil verifyWithSchema
 INFO: The field :$skipDoc present in DataConfig does not have a counterpart 
 in Solr Schema
 Jan 30, 2013 11:25:42 AM 
 org.apache.solr.handler.dataimport.config.ConfigParseUtil verifyWithSchema
 INFO: The field :$deleteDocById present in DataConfig does not have a 
 counterpart in Solr Schema
 Jan 30, 2013 11:25:42 AM org.apache.solr.handler.dataimport.DataImporter 
 loadDataConfig
 INFO: Data Configuration loaded successfully
 Jan 30, 2013 11:25:42 AM org.apache.solr.handler.dataimport.DataImporter 
 doFullImport
 INFO: Starting Full Import
 Jan 30, 2013 11:25:42 AM 
 org.apache.solr.handler.dataimport.SimplePropertiesWriter 
 readIndexerProperties
 WARNING: Unable to read: 
 ${dataimporter.request.country_code}.dataimport.properties



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()

2013-12-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843217#comment-13843217
 ] 

ASF subversion and git services commented on SOLR-5525:
---

Commit 1549591 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1549591 ]

SOLR-5525

 deprecate ClusterState#getCollectionStates() 
 -

 Key: SOLR-5525
 URL: https://issues.apache.org/jira/browse/SOLR-5525
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5525.patch, SOLR-5525.patch


 This is a very expensive call if there is a large number of collections. 
 Mostly, it is used to check whether a collection exists.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()

2013-12-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843220#comment-13843220
 ] 

ASF subversion and git services commented on SOLR-5525:
---

Commit 1549592 from [~noble.paul] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1549592 ]

SOLR-5525

 deprecate ClusterState#getCollectionStates() 
 -

 Key: SOLR-5525
 URL: https://issues.apache.org/jira/browse/SOLR-5525
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5525.patch, SOLR-5525.patch


 This is a very expensive call if there is a large number of collections. 
 Mostly, it is used to check whether a collection exists.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5525) deprecate ClusterState#getCollectionStates()

2013-12-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul resolved SOLR-5525.
--

Resolution: Fixed

 deprecate ClusterState#getCollectionStates() 
 -

 Key: SOLR-5525
 URL: https://issues.apache.org/jira/browse/SOLR-5525
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5525.patch, SOLR-5525.patch


 This is a very expensive call if there is a large number of collections. 
 Mostly, it is used to check whether a collection exists.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5473) Make one state.json per collection

2013-12-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5473:
-

Attachment: SOLR-5473.patch

a couple of tests fail

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node
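
 A minimal sketch of reading the proposed per-collection node with SolrZkClient 
 (the path layout comes from the description above; everything else is assumed 
 for illustration, not taken from the attached patch):

   String path = "/collections/" + collectionName + "/state.json";
   // zkClient is an existing org.apache.solr.common.cloud.SolrZkClient
   byte[] data = zkClient.getData(path, null, null, true);
   // parse the JSON into the collection's state as appropriate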



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843343#comment-13843343
 ] 

Mark Miller commented on SOLR-1301:
---

bq. if we need some of the classes this jar provides, we should declare direct 
dependencies on the appropriate artifacts.

Right - Wolfgang likely knows best when it comes to Morphlines. At a minimum 
we should pull the necessary jars in explicitly, I think. I've got to take a 
look at what they are.

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.
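
 A rough sketch of how the pieces described above fit together, using the classic 
 org.apache.hadoop.mapred API. The SolrDocumentConverter signature and the helper 
 used to register it are assumptions for illustration, not taken verbatim from 
 the patch:

   import java.util.Collection;
   import java.util.Collections;
   import org.apache.hadoop.io.Text;
   import org.apache.hadoop.mapred.JobConf;
   import org.apache.solr.common.SolrInputDocument;

   // Hypothetical converter: turns one Hadoop (key, value) pair into Solr documents.
   public class CsvLineConverter extends SolrDocumentConverter<Text, Text> {
     @Override
     public Collection<SolrInputDocument> convert(Text key, Text value) {
       SolrInputDocument doc = new SolrInputDocument();
       doc.addField("id", key.toString());        // field names are made up
       doc.addField("line_t", value.toString());
       return Collections.singletonList(doc);
     }
   }

   // Job setup (sketch): each reducer writes through SolrOutputFormat into an
   // EmbeddedSolrServer, producing one partial Solr home / shard per reducer.
   JobConf conf = new JobConf();
   conf.setOutputFormat(SolrOutputFormat.class);
   SolrDocumentConverter.setSolrDocumentConverter(CsvLineConverter.class, conf); // hypothetical helper
   conf.setNumReduceTasks(4); // four output shards: part-0 .. part-3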



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5467) Provide Solr Ref Guide in .epub format

2013-12-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843413#comment-13843413
 ] 

Hoss Man commented on SOLR-5467:


Thread where this initially came up: 
https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201311.mbox/%3c528a1321.4060...@hebis.uni-frankfurt.de%3E

 Provide Solr Ref Guide in .epub format
 --

 Key: SOLR-5467
 URL: https://issues.apache.org/jira/browse/SOLR-5467
 Project: Solr
  Issue Type: Wish
  Components: documentation
Reporter: Cassandra Targett

 From the solr-user list, a request for an .epub version of the Solr Ref Guide.
 There are two possible approaches that immediately come to mind:
 * Ask infra to install a plugin that automatically outputs the Confluence 
 pages in .epub
 * Investigate converting HTML export to .epub with something like calibre
 There might be other options, and there would be additional issues for 
 automating the process of creation and publication, so for now just recording 
 the request with an issue.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread wolfgang hoschek (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843443#comment-13843443
 ] 

wolfgang hoschek commented on SOLR-1301:


I'm not aware of anything needing jersey, except that perhaps hadoop pulls it in.

The combined dependencies of all morphline modules are listed here: 
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/cdk-morphlines-all/dependencies.html

The dependencies of each individual morphline module are listed here: 
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/cdk-morphlines-all/dependencies.html

The source and POMs are here, as usual: 
https://github.com/cloudera/cdk/tree/master/cdk-morphlines

By the way, a somewhat separate issue is that the ivy dependencies for 
solr-morphlines-core, solr-morphlines-cell, and solr-map-reduce seem a bit 
backwards upstream: solr-morphlines-core pulls in a ton of dependencies that it 
doesn't need, and those deps should rather be pulled in by solr-map-reduce 
(which is essentially an out-of-the-box app). It would be good to organize ivy 
and mvn upstream in such a way that 

* solr-map-reduce should depend on solr-morphlines-cell plus cdk-morphlines-all 
plus xyz
* solr-morphlines-cell should depend on solr-morphlines-core plus xyz
* solr-morphlines-core should depend on cdk-morphlines-core plus xyz 

More concretely, FWIW, to see how the deps look in production releases 
downstream, review the following POMs: 

https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-solr-core/pom.xml

and

https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-solr-cell/pom.xml

and

https://github.com/cloudera/search/blob/master_1.1.0/search-mr/pom.xml

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an 

[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread wolfgang hoschek (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843443#comment-13843443
 ] 

wolfgang hoschek edited comment on SOLR-1301 at 12/9/13 7:30 PM:
-

I'm not aware of anything needing jersey, except that perhaps hadoop pulls it in.

The combined dependencies of all morphline modules are listed here: 
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/cdk-morphlines-all/dependencies.html

The dependencies of each individual morphline module are listed here: 
http://cloudera.github.io/cdk/docs/current/dependencies.html

The source and POMs are here, as usual: 
https://github.com/cloudera/cdk/tree/master/cdk-morphlines

By the way, a somewhat separate issue is that the ivy dependencies for 
solr-morphlines-core, solr-morphlines-cell, and solr-map-reduce seem a bit 
backwards upstream: currently solr-morphlines-core pulls in a ton of 
dependencies that it doesn't need, and those deps should rather be pulled in by 
solr-map-reduce (which is essentially an out-of-the-box app that bundles 
user-level deps). Correspondingly, it would be good to organize ivy and mvn 
upstream in such a way that 

* solr-map-reduce should depend on solr-morphlines-cell plus cdk-morphlines-all 
minus cdk-morphlines-solr-cell (now upstream) minus cdk-morphlines-solr-core 
(now upstream) plus xyz
* solr-morphlines-cell should depend on solr-morphlines-core plus xyz
* solr-morphlines-core should depend on cdk-morphlines-core plus xyz 

More concretely, FWIW, to see how the deps look in production releases 
downstream, review the following POMs: 

https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-solr-core/pom.xml

and

https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-solr-cell/pom.xml

and

https://github.com/cloudera/search/blob/master_1.1.0/search-mr/pom.xml


was (Author: whoschek):
I'm not aware of anything needing jersey except perhaps hadoop pulls that in.

The combined dependencies of all morphline modules is here: 
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/cdk-morphlines-all/dependencies.html

The dependencies of each individual morphline modules is here: 
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/cdk-morphlines-all/dependencies.html

The source and POMs are here, as usual: 
https://github.com/cloudera/cdk/tree/master/cdk-morphlines

By the way, a somewhat separate issue is that it seems to me that the ivy 
dependences for solr-morphlines-core and solr-morphlines-cell and 
solr-map-reduce are a bit backwards upstream in that solr-morphlines-core pulls 
in a ton of dependencies that it doesn't need, and those deps should rather be 
pulled in by the solr-map-reduce (which is a essentially an out-of-the-box 
app). Would be good to organize ivy and mvn upstream in such a way that 

* solr-map-reduce should depend on solr-morphlines-cell plus cdk-morphlines-all 
plus xyz
* solr-morphlines-cell should depend on solr-morphlines-core plus xyz
* solr-morphlines-core should depend on cdk-morphlines-core plus xyz 

More concretely, FWIW, to see how the deps look like in production releases 
downstream review the following POMs: 

https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-solr-core/pom.xml

and

https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-solr-cell/pom.xml

and

https://github.com/cloudera/search/blob/master_1.1.0/search-mr/pom.xml

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 

[jira] [Commented] (SOLR-5473) Make one state.json per collection

2013-12-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843462#comment-13843462
 ] 

Mark Miller commented on SOLR-5473:
---

bq. if(debugState 

Best to do that with debug logging level rather than introduce a debug sys prop 
for this class.
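
For example (standard SLF4J usage, as used throughout Solr; the logger owner and 
message are placeholders, not from the patch):

  import org.slf4j.Logger;
  import org.slf4j.LoggerFactory;

  private static final Logger log = LoggerFactory.getLogger(ZkStateReader.class); // owning class assumed

  if (log.isDebugEnabled()) {
    log.debug("Updated cluster state: {}", clusterState); // placeholder message/variable
  }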

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5542) Global query parameters to facet queries

2013-12-09 Thread Isaac Hebsh (JIRA)
Isaac Hebsh created SOLR-5542:
-

 Summary: Global query parameters to facet queries
 Key: SOLR-5542
 URL: https://issues.apache.org/jira/browse/SOLR-5542
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 4.6
Reporter: Isaac Hebsh


(From the Mailing List)

It seems that a facet query does not use the global query parameters (for 
example, field aliasing for the edismax parser).
We make intensive use of facet queries (in some cases, we have a lot of 
facet.query parameters for a single q), and using LocalParams for each 
facet.query is not convenient.
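
To illustrate with a SolrJ sketch (field names and queries are made up): each 
facet.query currently has to repeat the parser and field aliasing as local 
params, even though q already picks them up from the global parameters:

  SolrQuery q = new SolrQuery("cheap phone");
  q.set("defType", "edismax");
  q.set("qf", "title^2 description");   // global aliasing, applied only to q
  q.setFacet(true);
  // the aliasing must be repeated for every facet.query via local params:
  q.addFacetQuery("{!edismax qf='title^2 description'}flagship");
  q.addFacetQuery("{!edismax qf='title^2 description'}budget");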



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843496#comment-13843496
 ] 

Steve Rowe commented on SOLR-1301:
--

[~whoschek], I'm lost: what do you mean by upstream/downstream?  In my 
experience, upstream refers to a parent project, i.e. one from which the 
project in question is derived, and downstream is the child/derived project.  
I don't know the history here, but you seem to be referring to the solr 
contribs when you say upstream?  If that's true, then my understanding of 
these terms is the opposite of how you're using them.  Maybe the question I 
should be asking is: what is/are the relationship(s) between/among 
cdk-morphlines-solr-* and solr-morphlines-*?

And (I assume) relatedly, how does cdk-morphlines-all relate to 
cdk-morphlines-solr-core/-cell?

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843524#comment-13843524
 ] 

Steve Rowe commented on SOLR-1301:
--

bq.  And (I assume) relatedly, how does cdk-morphlines-all relate to 
cdk-morphlines-solr-core/-cell?

I can answer this one myself from 
[https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-all/pom.xml]:
 it's an aggregation-only module that depends on all of the cdk-morphlines-* 
modules.

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread wolfgang hoschek (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843523#comment-13843523
 ] 

wolfgang hoschek commented on SOLR-1301:


Apologies for the confusion. We are upstreaming cdk-morphlines-solr-cell into 
the solr contrib solr-morphlines-cell, cdk-morphlines-solr-core into the solr 
contrib solr-morphlines-core, and search-mr into the solr contrib 
solr-map-reduce. Once the upstreaming is done, these old modules will go away. 
Next, downstream will be made identical to upstream, plus perhaps some critical 
fixes as necessary, and the upstream/downstream terms will then apply in the way 
folks usually think about them; we are not quite there yet, but getting there...

cdk-morphlines-all is simply a convenience POM that includes all the other 
morphline POMs, so there's less to type for users who like a bit more auto-magic.

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843524#comment-13843524
 ] 

Steve Rowe edited comment on SOLR-1301 at 12/9/13 8:34 PM:
---

bq.  And (I assume) relatedly, how does cdk-morphlines-all relate to 
cdk-morphlines-solr-core/-cell?

I can answer this one myself from 
[https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-all/pom.xml]:
 it's an aggregation-only module that depends on all of the cdk-morphlines-* 
modules.


was (Author: steve_rowe):
bq.  And (I assume) relatedly, how how does cdk-morphlines-all relate to 
cdk-morphlines-solr-core/-cell?

I can answer this one myself from 
[https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-all/pom.xml]:
 it's an aggregation-only modules that depends on all of the cdk-morphlines-* 
modules.

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters

2013-12-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5541:
-

Attachment: SOLR-5541.patch

Added test case

 Allow QueryElevationComponent to accept elevateIds and excludeIds as http 
 parameters
 

 Key: SOLR-5541
 URL: https://issues.apache.org/jira/browse/SOLR-5541
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 4.6
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.7

 Attachments: SOLR-5541.patch, SOLR-5541.patch


 The QueryElevationComponent currently uses an xml file to map query strings 
 to elevateIds and excludeIds.
 This ticket adds the ability to pass in elevateIds and excludeIds through two 
 new http parameters elevateIds and excludeIds.
 This will allow more sophisticated business logic to be used in selecting 
 which ids to elevate/exclude.
 Proposed syntax:
 http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8
 The elevateIds and excludeIds point to the unique document Id.
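
 A SolrJ sketch of the proposed parameters (parameter names are from this ticket; 
 the /elevate handler path follows the example above, and the document ids are 
 made up):

   SolrQuery q = new SolrQuery("*:*");
   q.setRequestHandler("/elevate");
   q.set("elevateIds", "3,4");
   q.set("excludeIds", "6,8");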



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4983) Problematic core naming by collection create API

2013-12-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843571#comment-13843571
 ] 

Mark Miller commented on SOLR-4983:
---

bq. could anyone suggest if by creating cores separately (with the same 
collection name) we would achieve the same effect as creating collection via 
Collections API? 

By and large, currently, yes, this is supported. There is a flag that tracks 
whether the collection was created with the Collections API - if it was, you 
will be able to use further features in the future - but currently you should 
be able to use the Cores API to do what you want with no problem.

 Problematic core naming by collection create API 
 -

 Key: SOLR-4983
 URL: https://issues.apache.org/jira/browse/SOLR-4983
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Chris Toomey

 The SolrCloud collection create API creates cores named 
 foo_shardx_replicay when asked to create collection foo.
 This is problematic for at least 2 reasons: 
 1) these ugly core names show up in the core admin UI, and will vary 
 depending on which node is being used,
 2) it prevents collections from being used in SolrCloud joins, since join 
 takes a core name as the fromIndex parameter and there's no single core name 
 for the collection.  As I've documented in 
 https://issues.apache.org/jira/browse/SOLR-4905 and 
 http://lucene.472066.n3.nabble.com/Joins-with-SolrCloud-tp4073199p4074038.html,
  SolrCloud join does work when the inner collection (fromIndex) is not 
 sharded, assuming that collection is available and initialized at SolrCloud 
 bootstrap time.
 Could this be changed to instead use the collection name for the core name?  
 Or at least add a core-name option to the API?
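
 For reference, a sketch of the join usage behind point 2 (field, core, and query 
 values are made up): fromIndex must name a core, so without a predictable core 
 name per collection the query cannot be written portably:

   SolrQuery q = new SolrQuery("{!join fromIndex=products from=product_id to=id}category:phones");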



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 5920 - Failure!

2013-12-09 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/5920/

1 tests failed.
REGRESSION:  org.apache.lucene.index.TestIndexableField.testArbitraryFields

Error Message:
-60

Stack Trace:
java.lang.ArrayIndexOutOfBoundsException: -60
at 
__randomizedtesting.SeedInfo.seed([516D1CE5843E2B26:7C99D9D3760B2809]:0)
at java.util.ArrayList.get(ArrayList.java:324)
at 
org.apache.lucene.search.RandomSimilarityProvider.get(RandomSimilarityProvider.java:106)
at 
org.apache.lucene.search.similarities.PerFieldSimilarityWrapper.computeNorm(PerFieldSimilarityWrapper.java:45)
at 
org.apache.lucene.index.NormsConsumerPerField.finish(NormsConsumerPerField.java:49)
at 
org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:201)
at 
org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
at 
org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1520)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1190)
at 
org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:146)
at 
org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:108)
at 
org.apache.lucene.index.TestIndexableField.testArbitraryFields(TestIndexableField.java:191)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 

[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters

2013-12-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843573#comment-13843573
 ] 

Mark Miller commented on SOLR-5541:
---

+1

One comment:

+  assertQ("All six should make it", req

Should update the copy/paste assert comment - only 5 should make it because b 
is excluded.

 Allow QueryElevationComponent to accept elevateIds and excludeIds as http 
 parameters
 

 Key: SOLR-5541
 URL: https://issues.apache.org/jira/browse/SOLR-5541
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 4.6
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.7

 Attachments: SOLR-5541.patch, SOLR-5541.patch


 The QueryElevationComponent currently uses an xml file to map query strings 
 to elevateIds and excludeIds.
 This ticket adds the ability to pass in elevateIds and excludeIds through two 
 new http parameters elevateIds and excludeIds.
 This will allow more sophisticated business logic to be used in selecting 
 which ids to elevate/exclude.
 Proposed syntax:
 http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8
 The elevateIds and excludeIds point to the unique document Id.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 5920 - Failure!

2013-12-09 Thread Simon Willnauer
nice one - we ran into Math.abs(Integer.MIN_VALUE), which overflows and stays negative
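
A quick illustration of the pitfall (general Java behaviour, not the actual Lucene code):

  int h = Integer.MIN_VALUE;
  System.out.println(Math.abs(h));            // -2147483648: abs overflows and stays negative
  System.out.println(Math.abs(h) % 35);       // still negative -> bad list index
  System.out.println((h & 0x7fffffff) % 35);  // masking the sign bit keeps it non-negative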

On Mon, Dec 9, 2013 at 9:24 PM,  buil...@flonkings.com wrote:
 Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/5920/

 1 tests failed.
 REGRESSION:  org.apache.lucene.index.TestIndexableField.testArbitraryFields

 Error Message:
 -60

 Stack Trace:
 java.lang.ArrayIndexOutOfBoundsException: -60
 at 
 __randomizedtesting.SeedInfo.seed([516D1CE5843E2B26:7C99D9D3760B2809]:0)
 at java.util.ArrayList.get(ArrayList.java:324)
 at 
 org.apache.lucene.search.RandomSimilarityProvider.get(RandomSimilarityProvider.java:106)
 at 
 org.apache.lucene.search.similarities.PerFieldSimilarityWrapper.computeNorm(PerFieldSimilarityWrapper.java:45)
 at 
 org.apache.lucene.index.NormsConsumerPerField.finish(NormsConsumerPerField.java:49)
 at 
 org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:201)
 at 
 org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
 at 
 org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
 at 
 org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
 at 
 org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1520)
 at 
 org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1190)
 at 
 org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:146)
 at 
 org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:108)
 at 
 org.apache.lucene.index.TestIndexableField.testArbitraryFields(TestIndexableField.java:191)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 

Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 5920 - Failure!

2013-12-09 Thread Simon Willnauer
I committed a fix

On Mon, Dec 9, 2013 at 9:36 PM, Simon Willnauer sim...@apache.org wrote:
 nice one - we ran into Math.abs(Integer.MIN_VALUE), which still returns a negative value

 On Mon, Dec 9, 2013 at 9:24 PM,  buil...@flonkings.com wrote:
 Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/5920/

 1 tests failed.
 REGRESSION:  org.apache.lucene.index.TestIndexableField.testArbitraryFields

 Error Message:
 -60


[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters

2013-12-09 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843584#comment-13843584
 ] 

Joel Bernstein commented on SOLR-5541:
--

Thanks Mark, I'll fix that up.

 Allow QueryElevationComponent to accept elevateIds and excludeIds as http 
 parameters
 

 Key: SOLR-5541
 URL: https://issues.apache.org/jira/browse/SOLR-5541
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 4.6
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.7

 Attachments: SOLR-5541.patch, SOLR-5541.patch


 The QueryElevationComponent currently uses an xml file to map query strings 
 to elevateIds and excludeIds.
 This ticket adds the ability to pass in elevateIds and excludeIds through two 
 new http parameters elevateIds and excludeIds.
 This will allow more sophisticated business logic to be used in selecting 
 which ids to elevate/exclude.
 Proposed syntax:
 http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8
 The elevateIds and excludeIds point to the unique document Id.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 5920 - Failure!

2013-12-09 Thread Dawid Weiss
Hurray for the improbable... :)

D.

On Mon, Dec 9, 2013 at 10:36 PM, Simon Willnauer sim...@apache.org wrote:
 nice one - we ran into Math.abs(Integer.MIN_VALUE), which still returns a negative value

 On Mon, Dec 9, 2013 at 9:24 PM,  buil...@flonkings.com wrote:
 Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/5920/

 1 tests failed.
 REGRESSION:  org.apache.lucene.index.TestIndexableField.testArbitraryFields

 Error Message:
 -60


[jira] [Updated] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters

2013-12-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5541:
-

Attachment: SOLR-5541.patch

 Allow QueryElevationComponent to accept elevateIds and excludeIds as http 
 parameters
 

 Key: SOLR-5541
 URL: https://issues.apache.org/jira/browse/SOLR-5541
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 4.6
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.7

 Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch


 The QueryElevationComponent currently uses an xml file to map query strings 
 to elevateIds and excludeIds.
 This ticket adds the ability to pass in elevateIds and excludeIds through two 
 new http parameters elevateIds and excludeIds.
 This will allow more sophisticated business logic to be used in selecting 
 which ids to elevate/exclude.
 Proposed syntax:
 http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8
 The elevateIds and excludeIds point to the unique document Id.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5364) Review usages of hard-coded Version constants

2013-12-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843652#comment-13843652
 ] 

ASF subversion and git services commented on LUCENE-5364:
-

Commit 1549701 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1549701 ]

LUCENE-5364: Replace hard-coded Version.LUCENE_XY that doesn't have to be 
hard-coded (because of back-compat testing or version dependent behavior, or 
demo code that should exemplify pinning versions in user code), with 
Version.LUCENE_CURRENT in non-test code, or with 
LuceneTestCase.TEST_VERSION_CURRENT in test code; upgrade hard-coded 
Version.LUCENE_XY constants that should track the next release version to the 
next release version if they aren't already there, and put a token near them so 
that they can be found and upgraded when the next release version changes: 
':Post-Release-Update-Version.LUCENE_XY:'
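
As a hypothetical illustration of the distinction the commit message draws (my own sketch, 
not code from the commit):

{code}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

public class VersionConstants {
  // Deliberately pinned: keeps the 4.0-era analysis behavior for back-compat.
  static final StandardAnalyzer PINNED = new StandardAnalyzer(Version.LUCENE_40);

  // Tracks whatever release the code ships in, so it never needs a per-release edit;
  // test code would use LuceneTestCase.TEST_VERSION_CURRENT instead.
  static final StandardAnalyzer CURRENT = new StandardAnalyzer(Version.LUCENE_CURRENT);
}
{code}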

 Review usages of hard-coded Version constants
 -

 Key: LUCENE-5364
 URL: https://issues.apache.org/jira/browse/LUCENE-5364
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 5.0, 4.7
Reporter: Steve Rowe
Priority: Minor
 Attachments: LUCENE-5364-branch_4x.patch, LUCENE-5364-trunk.patch, 
 LUCENE-5364-trunk.patch


 There are some hard-coded {{Version.LUCENE_XY}} constants used in various 
 places.  Some of these are intentional and appropriate:
 * in deprecated code, e.g. {{ArabicLetterTokenizer}}, deprecated in 3.1, uses 
 {{Version.LUCENE_31}}
 * to make behavior version-dependent (e.g. {{StandardTokenizer}} and other 
 analysis components)
 * to test different behavior at different points in history (e.g. 
 {{TestStopFilter}} to test position increments)
 But should hard-coded constants be used elsewhere?
 For those that should remain, and need to be updated with each release, there 
 should be an easy way to find them.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5364) Review usages of hard-coded Version constants

2013-12-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843658#comment-13843658
 ] 

ASF subversion and git services commented on LUCENE-5364:
-

Commit 1549703 from [~steve_rowe] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1549703 ]

LUCENE-5364: Replace hard-coded Version.LUCENE_XY that doesn't have to be 
hard-coded (because of back-compat testing or version dependent behavior, or 
demo code that should exemplify pinning versions in user code), with 
Version.LUCENE_CURRENT in non-test code, or with 
LuceneTestCase.TEST_VERSION_CURRENT in test code; upgrade hard-coded 
Version.LUCENE_XY constants that should track the next release version to the 
next release version if they aren't already there, and put a token near them so 
that they can be found and upgraded when the next release version changes: 
':Post-Release-Update-Version.LUCENE_XY:' (merge trunk r1549701)

 Review usages of hard-coded Version constants
 -

 Key: LUCENE-5364
 URL: https://issues.apache.org/jira/browse/LUCENE-5364
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 5.0, 4.7
Reporter: Steve Rowe
Priority: Minor
 Attachments: LUCENE-5364-branch_4x.patch, LUCENE-5364-trunk.patch, 
 LUCENE-5364-trunk.patch


 There are some hard-coded {{Version.LUCENE_XY}} constants used in various 
 places.  Some of these are intentional and appropriate:
 * in deprecated code, e.g. {{ArabicLetterTokenizer}}, deprecated in 3.1, uses 
 {{Version.LUCENE_31}}
 * to make behavior version-dependent (e.g. {{StandardTokenizer}} and other 
 analysis components)
 * to test different behavior at different points in history (e.g. 
 {{TestStopFilter}} to test position increments)
 But should hard-coded constants be used elsewhere?
 For those that should remain, and need to be updated with each release, there 
 should be an easy way to find them.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)

2013-12-09 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5463:
---

Attachment: SOLR-5463__straw_man.patch

Ok, updated patch making the change in user semantics I mentioned wanting to 
try last week.  Best way to explain it is with a walk through of a simple 
example (note: if you try the current strawman code, the numFound and start 
values returned in the docList don't match what I've pasted in the examples 
below -- these examples show what the final results should look like in the 
finished solution)

Initial requests using searchAfter should always start with a totem value of 
{{\*}}

{code:title=http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&searchAfter=*}
{
  "responseHeader":{
    "status":0,
    "QTime":2},
  "response":{"numFound":32,"start":-1,"docs":[
      // ...20 docs here...
  ]
  },
  "nextSearchAfter":"AoEjTk9L"}
{code}

The {{nextSearchAfter}} token returned by this request tells us what to use in 
the second request...

{code:title=http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&searchAfter=AoEjTk9L}
{
  "responseHeader":{
    "status":0,
    "QTime":7},
  "response":{"numFound":32,"start":-1,"docs":[
      // ...12 docs here...
  ]
  },
  "nextSearchAfter":"AoEoMDU3OUIwMDI="}
{code}

Since this result block contains fewer rows than were requested, the client 
could automatically stop, but the {{nextSearchAfter}} is still returned, and 
it's still safe to request a subsequent page (this is the fundamental difference from 
the previous patches, where {{nextSearchAfter}} was set to {{null}} anytime the 
code could tell there were no more results) ...

{code:title=http://localhost:8983/solr/deep?q=*:*&wt=json&indent=true&rows=20&fl=id,price&sort=id+desc&searchAfter=AoEoMDU3OUIwMDI=}
{
  "responseHeader":{
    "status":0,
    "QTime":1},
  "response":{"numFound":32,"start":-1,"docs":[]
  },
  "nextSearchAfter":"AoEoMDU3OUIwMDI="}
{code}

Note that in this case, with no docs included in the response, the 
{{nextSearchAfter}} totem is the same as the input.

For some sorts this makes it possible for clients to resume a full walk of 
all documents matching a query -- picking up where they left off if more 
documents are added to the index that match (for example: when doing an 
ascending sort on a numeric uniqueKey field that always increases as new docs 
are added, sorting by a timestamp field (asc) indicating when documents are 
crawled, etc...)

This also works as you would expect for searches that don't match any 
documents...

{code:title=http://localhost:8983/solr/deep?q=text:bogus&rows=20&sort=id+desc&searchAfter=*}
{
  "responseHeader":{
    "status":0,
    "QTime":21},
  "response":{"numFound":0,"start":-1,"docs":[]
  },
  "nextSearchAfter":"*"}
{code}
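
To make the intended client-side loop concrete, here is a rough SolrJ-style sketch (my 
own illustration using the strawman searchAfter / nextSearchAfter names; not part of 
the patch):

{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DeepPagingWalk {
  public static void main(String[] args) throws SolrServerException {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/deep");
    String totem = "*";                              // initial requests always start with '*'
    while (true) {
      SolrQuery q = new SolrQuery("*:*");
      q.setRows(20);
      q.setSort("id", SolrQuery.ORDER.desc);
      q.set("searchAfter", totem);                   // strawman request param
      QueryResponse rsp = solr.query(q);
      String next = (String) rsp.getResponse().get("nextSearchAfter"); // strawman response key
      // ... process rsp.getResults() here ...
      if (rsp.getResults().isEmpty() || next.equals(totem)) {
        break;                                       // unchanged totem: no more matches for now
      }
      totem = next;
    }
  }
}
{code}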


 Provide cursor/token based searchAfter support that works with arbitrary 
 sorting (ie: deep paging)
 --

 Key: SOLR-5463
 URL: https://issues.apache.org/jira/browse/SOLR-5463
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch


 I'd like to revisit a solution to the problem of deep paging in Solr, 
 leveraging an HTTP-based API similar to how IndexSearcher.searchAfter works 
 at the Lucene level: require the clients to provide back a token indicating 
 the sort values of the last document seen on the previous page.  This is 
 similar to the cursor model I've seen in several other REST APIs that 
 support pagination over large sets of results (notably the Twitter API and 
 its since_id param), except that we'll want something that works with 
 arbitrary multi-level sort criteria that can be either ascending or descending.
 SOLR-1726 laid some initial groundwork here and was committed quite a while 
 ago, but the key bit of argument parsing to leverage it was commented out due 
 to some problems (see comments in that issue).  It's also somewhat out of 
 date at this point: at the time it was committed, IndexSearcher only supported 
 searchAfter for simple scores, not arbitrary field sorts; and the params 
 added in SOLR-1726 suffer from this limitation as well.
 ---
 I think it would make sense to start fresh with a new issue with a focus on 
 ensuring that we have deep paging which:
 * supports arbitrary field sorts in addition to sorting by score
 * works in distributed mode



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: 

[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)

2013-12-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843687#comment-13843687
 ] 

Hoss Man commented on SOLR-5463:


The one significant change I still want to make before abandoning this straw 
man and moving on to using PaginatingCollector under the covers is to rethink 
the vocabulary.

At the Lucene/IndexSearcher level, this functionality is leveraged using a 
searchAfter param which indicates the exact FieldDoc returned by a previous 
search.  The name makes a lot of sense in this API given that the FieldDoc you 
specify is expected to come from a previous search, and you are specifying that 
you want to search for documents after this document in the context of the 
specified query/sort.

For the Solr request API however, I feel like this terminology might confuse 
people.  I'm concerned people might think they can use the uniqueKey of the 
last document they got on the previous page (instead of realizing they need to 
specify the special token they were returned as part of that page).

My thinking is that from a user perspective, we should call this functionality 
a Result Cursor and rename the request param and response key appropriately, 
something along the lines of...

{code:title=http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&cursor=AoEjTk9L}
{
  "responseHeader":{
    "status":0,
    "QTime":7},
  "response":{"numFound":32,"start":-1,"docs":[
      // ... docs here...
  ]
  },
  "cursorContinue":"AoEoMDU3OUIwMDI="}
{code}

* searchAfter = cursor
* nextSearchAfter = cursorContinue

What do folks think?


 Provide cursor/token based searchAfter support that works with arbitrary 
 sorting (ie: deep paging)
 --

 Key: SOLR-5463
 URL: https://issues.apache.org/jira/browse/SOLR-5463
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch


 I'd like to revisit a solution to the problem of deep paging in Solr, 
 leveraging an HTTP-based API similar to how IndexSearcher.searchAfter works 
 at the Lucene level: require the clients to provide back a token indicating 
 the sort values of the last document seen on the previous page.  This is 
 similar to the cursor model I've seen in several other REST APIs that 
 support pagination over large sets of results (notably the Twitter API and 
 its since_id param), except that we'll want something that works with 
 arbitrary multi-level sort criteria that can be either ascending or descending.
 SOLR-1726 laid some initial groundwork here and was committed quite a while 
 ago, but the key bit of argument parsing to leverage it was commented out due 
 to some problems (see comments in that issue).  It's also somewhat out of 
 date at this point: at the time it was committed, IndexSearcher only supported 
 searchAfter for simple scores, not arbitrary field sorts; and the params 
 added in SOLR-1726 suffer from this limitation as well.
 ---
 I think it would make sense to start fresh with a new issue with a focus on 
 ensuring that we have deep paging which:
 * supports arbitrary field sorts in addition to sorting by score
 * works in distributed mode



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5364) Review usages of hard-coded Version constants

2013-12-09 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved LUCENE-5364.


   Resolution: Fixed
Fix Version/s: 4.7
   5.0
 Assignee: Steve Rowe
Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and branch_4x.

I added a note to the Lucene ReleaseToDo wiki page about using 
{{:Post-Release-Update-Version.LUCENE_XY:}} to find constants that should be 
upgraded to the next release version after a release branch has been cut.

 Review usages of hard-coded Version constants
 -

 Key: LUCENE-5364
 URL: https://issues.apache.org/jira/browse/LUCENE-5364
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 5.0, 4.7
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor
 Fix For: 5.0, 4.7

 Attachments: LUCENE-5364-branch_4x.patch, LUCENE-5364-trunk.patch, 
 LUCENE-5364-trunk.patch


 There are some hard-coded {{Version.LUCENE_XY}} constants used in various 
 places.  Some of these are intentional and appropriate:
 * in deprecated code, e.g. {{ArabicLetterTokenizer}}, deprecated in 3.1, uses 
 {{Version.LUCENE_31}}
 * to make behavior version-dependent (e.g. {{StandardTokenizer}} and other 
 analysis components)
 * to test different behavior at different points in history (e.g. 
 {{TestStopFilter}} to test position increments)
 But should hard-coded constants be used elsewhere?
 For those that should remain, and need to be updated with each release, there 
 should be an easy way to find them.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)

2013-12-09 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843748#comment-13843748
 ] 

Steve Rowe commented on SOLR-5463:
--

{quote}
* searchAfter = cursor
* nextSearchAfter = cursorContinue
{quote}

+1

bq. I'm concerned people might think they can use the uniqueKey of the last 
document they got on the previous page

I tried making this mistake (using the trailing unique id ("NOK" in this 
example) as the searchAfter param value), and I got the following error message:

{code}
{
  "responseHeader":{
    "status":400,
    "QTime":2},
  "error":{
    "msg":"Unable to parse search after totem: NOK",
    "code":400}}
{code}

I think that error message should include the param name ({{cursorContinue}}) 
that couldn't be parsed.

Also, maybe it would be useful to include a prefix that will (probably) never 
be used in unique ids, to visually identify the cursor as such: like always 
prepending '*'?  So your example of the future would become:

{code:title=http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&cursor=*AoEjTk9L}
{
  "responseHeader":{
    "status":0,
    "QTime":7},
  "response":{"numFound":32,"start":-1,"docs":[
      // ... docs here...
  ]
  },
  "cursorContinue":"*AoEoMDU3OUIwMDI="}
{code}

The error message when someone gives an unparseable {{cursor}} could then 
include this piece of information: cursors begin with an asterisk.

 Provide cursor/token based searchAfter support that works with arbitrary 
 sorting (ie: deep paging)
 --

 Key: SOLR-5463
 URL: https://issues.apache.org/jira/browse/SOLR-5463
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch


 I'd like to revisit a solution to the problem of deep paging in Solr, 
 leveraging an HTTP-based API similar to how IndexSearcher.searchAfter works 
 at the Lucene level: require the clients to provide back a token indicating 
 the sort values of the last document seen on the previous page.  This is 
 similar to the cursor model I've seen in several other REST APIs that 
 support pagination over large sets of results (notably the Twitter API and 
 its since_id param), except that we'll want something that works with 
 arbitrary multi-level sort criteria that can be either ascending or descending.
 SOLR-1726 laid some initial groundwork here and was committed quite a while 
 ago, but the key bit of argument parsing to leverage it was commented out due 
 to some problems (see comments in that issue).  It's also somewhat out of 
 date at this point: at the time it was committed, IndexSearcher only supported 
 searchAfter for simple scores, not arbitrary field sorts; and the params 
 added in SOLR-1726 suffer from this limitation as well.
 ---
 I think it would make sense to start fresh with a new issue with a focus on 
 ensuring that we have deep paging which:
 * supports arbitrary field sorts in addition to sorting by score
 * works in distributed mode



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5473) Make one state.json per collection

2013-12-09 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843749#comment-13843749
 ] 

Timothy Potter commented on SOLR-5473:
--

Thanks for fixing the CloudSolrServerTest failure ... One thing I wasn't sure 
about when looking over the latest patch was whether allCollections in 
ZkStateReader will hold the names of external collections? I assume so by the 
name *all* but it doesn't seem like any external collection names are added to 
that Set currently. 

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)

2013-12-09 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843748#comment-13843748
 ] 

Steve Rowe edited comment on SOLR-5463 at 12/10/13 12:21 AM:
-

{quote}
* searchAfter = cursor
* nextSearchAfter = cursorContinue
{quote}

+1

bq. I'm concerned people might think they can use the uniqueKey of the last 
document they got on the previous page

I tried making this mistake (using the trailing unique id ("NOK" in this 
example) as the searchAfter param value), and I got the following error message:

{code}
{
  "responseHeader":{
    "status":400,
    "QTime":2},
  "error":{
    "msg":"Unable to parse search after totem: NOK",
    "code":400}}
{code}

(*edit*: {{cursorContinue}} = {{cursor}} in the sentence below)

I think that error message should include the param name ({{cursor}}) that 
couldn't be parsed.

Also, maybe it would be useful to include a prefix that will (probably) never 
be used in unique ids, to visually identify the cursor as such: like always 
prepending '*'?  So your example of the future would become:

{code:title=http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&cursor=*AoEjTk9L}
{
  "responseHeader":{
    "status":0,
    "QTime":7},
  "response":{"numFound":32,"start":-1,"docs":[
      // ... docs here...
  ]
  },
  "cursorContinue":"*AoEoMDU3OUIwMDI="}
{code}

The error message when someone gives an unparseable {{cursor}} could then 
include this piece of information: cursors begin with an asterisk.


was (Author: steve_rowe):
{quote}
* searchAfter = cursor
* nextSearchAfter = cursorContinue
{quote}

+1

bq. I'm concerned people might think they can use the uniqueKey of the last 
document they got on the previous page

I tried making this mistake (using the trailing unique id ("NOK" in this 
example) as the searchAfter param value), and I got the following error message:

{code}
{
  "responseHeader":{
    "status":400,
    "QTime":2},
  "error":{
    "msg":"Unable to parse search after totem: NOK",
    "code":400}}
{code}

I think that error message should include the param name ({{cursorContinue}}) 
that couldn't be parsed.

Also, maybe it would be useful to include a prefix that will (probably) never 
be used in unique ids, to visually identify the cursor as such: like always 
prepending '*'?  So your example of the future would become:

{code:title=http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&cursor=*AoEjTk9L}
{
  "responseHeader":{
    "status":0,
    "QTime":7},
  "response":{"numFound":32,"start":-1,"docs":[
      // ... docs here...
  ]
  },
  "cursorContinue":"*AoEoMDU3OUIwMDI="}
{code}

The error message when someone gives an unparseable {{cursor}} could then 
include this piece of information: cursors begin with an asterisk.

 Provide cursor/token based searchAfter support that works with arbitrary 
 sorting (ie: deep paging)
 --

 Key: SOLR-5463
 URL: https://issues.apache.org/jira/browse/SOLR-5463
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch


 I'd like to revisit a solution to the problem of deep paging in Solr, 
 leveraging an HTTP-based API similar to how IndexSearcher.searchAfter works 
 at the Lucene level: require the clients to provide back a token indicating 
 the sort values of the last document seen on the previous page.  This is 
 similar to the cursor model I've seen in several other REST APIs that 
 support pagination over large sets of results (notably the Twitter API and 
 its since_id param), except that we'll want something that works with 
 arbitrary multi-level sort criteria that can be either ascending or descending.
 SOLR-1726 laid some initial groundwork here and was committed quite a while 
 ago, but the key bit of argument parsing to leverage it was commented out due 
 to some problems (see comments in that issue).  It's also somewhat out of 
 date at this point: at the time it was committed, IndexSearcher only supported 
 searchAfter for simple scores, not arbitrary field sorts; and the params 
 added in SOLR-1726 suffer from this limitation as well.
 ---
 I think it would make sense to start fresh with a new issue with a focus on 
 ensuring that we have deep paging which:
 * supports arbitrary field sorts in addition to sorting by score
 * works in distributed mode



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional 

[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)

2013-12-09 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843774#comment-13843774
 ] 

Steve Rowe commented on SOLR-5463:
--

Another idea about the cursor: the Base64-encoded text is used verbatim, 
including the trailing padding '=' characters - these could be stripped out for 
external use (since they're there just to make the string length divisible by 
four), and then added back before Base64-decoding.  In a URL, non-metacharacter 
'='-s look weird, since they're already used to separate param names and values.
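
A tiny sketch of that idea (my own illustration; java.util.Base64 is used here purely 
for brevity):

{code}
import java.util.Base64;

public class CursorPadding {
  // Strip trailing '=' padding from the externally visible cursor string.
  static String externalize(String encoded) {
    return encoded.replaceAll("=+$", "");
  }

  // Re-pad to a multiple of four characters before decoding internally.
  static byte[] internalize(String external) {
    StringBuilder padded = new StringBuilder(external);
    while (padded.length() % 4 != 0) {
      padded.append('=');
    }
    return Base64.getDecoder().decode(padded.toString());
  }
}
{code}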

 Provide cursor/token based searchAfter support that works with arbitrary 
 sorting (ie: deep paging)
 --

 Key: SOLR-5463
 URL: https://issues.apache.org/jira/browse/SOLR-5463
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
 SOLR-5463__straw_man.patch


 I'd like to revisit a solution to the problem of deep paging in Solr, 
 leveraging an HTTP-based API similar to how IndexSearcher.searchAfter works 
 at the Lucene level: require the clients to provide back a token indicating 
 the sort values of the last document seen on the previous page.  This is 
 similar to the cursor model I've seen in several other REST APIs that 
 support pagination over large sets of results (notably the Twitter API and 
 its since_id param), except that we'll want something that works with 
 arbitrary multi-level sort criteria that can be either ascending or descending.
 SOLR-1726 laid some initial groundwork here and was committed quite a while 
 ago, but the key bit of argument parsing to leverage it was commented out due 
 to some problems (see comments in that issue).  It's also somewhat out of 
 date at this point: at the time it was committed, IndexSearcher only supported 
 searchAfter for simple scores, not arbitrary field sorts; and the params 
 added in SOLR-1726 suffer from this limitation as well.
 ---
 I think it would make sense to start fresh with a new issue with a focus on 
 ensuring that we have deep paging which:
 * supports arbitrary field sorts in addition to sorting by score
 * works in distributed mode



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843827#comment-13843827
 ] 

Mark Miller commented on SOLR-1301:
---

bq. I'm not aware of anything needing jersey except perhaps hadoop pulls that 
in.

Yeah, tests use this for running hadoop.

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains a contrib module that provides distributed indexing 
 (using Hadoop) to Solr via EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4983) Problematic core naming by collection create API

2013-12-09 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843840#comment-13843840
 ] 

Noble Paul commented on SOLR-4983:
--

I think solving his problem alone is simple. If the collection is present in 
the same JVM, it is very easy to do a lookup of the collection, and if there is 
a core that serves the collection, set the fromIndex as that. If the user can 
ensure that all his collections are present in all nodes it will be OK. The 
hard part is making it work with a remote node.


 Problematic core naming by collection create API 
 -

 Key: SOLR-4983
 URL: https://issues.apache.org/jira/browse/SOLR-4983
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Chris Toomey

 The SolrCloud collection create API creates cores named 
 foo_shardx_replicay when asked to create collection foo.
 This is problematic for at least 2 reasons: 
 1) these ugly core names show up in the core admin UI, and will vary 
 depending on which node is being used,
 2) it prevents collections from being used in SolrCloud joins, since join 
 takes a core name as the fromIndex parameter and there's no single core name 
 for the collection.  As I've documented in 
 https://issues.apache.org/jira/browse/SOLR-4905 and 
 http://lucene.472066.n3.nabble.com/Joins-with-SolrCloud-tp4073199p4074038.html,
  SolrCloud join does work when the inner collection (fromIndex) is not 
 sharded, assuming that collection is available and initialized at SolrCloud 
 bootstrap time.
 Could this be changed to instead use the collection name for the core name?  
 Or at least add a core-name option to the API?



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5473) Make one state.json per collection

2013-12-09 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843933#comment-13843933
 ] 

Noble Paul commented on SOLR-5473:
--

[~timp74]
The allCollections set will store ALL collections. If you are looking at the 
trunk, there are no external collections in trunk yet. Please apply the patch 
and check.


 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5473) Make one state.json per collection

2013-12-09 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843935#comment-13843935
 ] 

Noble Paul commented on SOLR-5473:
--

bq. if(debugState 

Thanks for the suggestion. However, I added it for my dev testing; it will be 
removed before commit.


 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5543) solr.xml duplicate entries after SWAP 4.6

2013-12-09 Thread Bill Bell (JIRA)
Bill Bell created SOLR-5543:
---

 Summary: solr.xml duplicate entries after SWAP 4.6
 Key: SOLR-5543
 URL: https://issues.apache.org/jira/browse/SOLR-5543
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Bill Bell


We are having issues with SWAP CoreAdmin in 4.6.

Using legacy solr.xml we issue a CoreAdmin SWAP, and we want it persistent. It 
has been running flawlessly since 4.5. Now it creates duplicate lines in solr.xml.

Even the example multi-core setup doesn't work with persistent=true - it 
creates duplicate lines in solr.xml.

<cores adminPath="/admin/cores">
  <core name="autosuggest" loadOnStartup="true" instanceDir="autosuggest" transient="false"/>
  <core name="citystateprovider" loadOnStartup="true" instanceDir="citystateprovider" transient="false"/>
  <core name="collection1" loadOnStartup="true" instanceDir="collection1" transient="false"/>
  <core name="facility" loadOnStartup="true" instanceDir="facility" transient="false"/>
  <core name="inactiveproviders" loadOnStartup="true" instanceDir="inactiveproviders" transient="false"/>
  <core name="linesvcgeo" instanceDir="linesvcgeo" loadOnStartup="true" transient="false"/>
  <core name="linesvcgeofull" instanceDir="linesvcgeofull" loadOnStartup="true" transient="false"/>
  <core name="locationgeo" loadOnStartup="true" instanceDir="locationgeo" transient="false"/>
  <core name="market" loadOnStartup="true" instanceDir="market" transient="false"/>
  <core name="portalprovider" loadOnStartup="true" instanceDir="portalprovider" transient="false"/>
  <core name="practice" loadOnStartup="true" instanceDir="practice" transient="false"/>
  <core name="provider" loadOnStartup="true" instanceDir="provider" transient="false"/>
  <core name="providersearch" loadOnStartup="true" instanceDir="providersearch" transient="false"/>
  <core name="tridioncomponents" loadOnStartup="true" instanceDir="tridioncomponents" transient="false"/>
  <core name="linesvcgeo" instanceDir="linesvcgeo" loadOnStartup="true" transient="false"/>
  <core name="linesvcgeofull" instanceDir="linesvcgeofull" loadOnStartup="true" transient="false"/>
</cores>



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5544) Log spamming DefaultSolrHighlighter

2013-12-09 Thread MANISH KUMAR (JIRA)
MANISH KUMAR created SOLR-5544:
--

 Summary: Log spamming DefaultSolrHighlighter
 Key: SOLR-5544
 URL: https://issues.apache.org/jira/browse/SOLR-5544
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0
Reporter: MANISH KUMAR


In DefaultSolrHighlighter.java

The method useFastVectorHighlighter has
 log.warn( "Solr will use Highlighter instead of FastVectorHighlighter because 
{} field does not store TermPositions and TermOffsets.", fieldName );

The above method gets called for each field, and there can be cases where 
TermPositions & TermOffsets are not stored.

The above line causes huge spamming of the logs.

It should be at most a DEBUG-level log, which gives the flexibility of turning it 
off in production environments.
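
A minimal sketch of the suggested change (my own illustration; the message and fieldName 
are taken from the snippet above):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HighlighterFallbackLogging {
  private static final Logger log = LoggerFactory.getLogger(HighlighterFallbackLogging.class);

  void logFallback(String fieldName) {
    // DEBUG instead of WARN: the per-field message can then be silenced in production configs.
    log.debug("Solr will use Highlighter instead of FastVectorHighlighter because {}"
        + " field does not store TermPositions and TermOffsets.", fieldName);
  }
}
{code}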



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5544) Log spamming DefaultSolrHighlighter

2013-12-09 Thread MANISH KUMAR (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MANISH KUMAR updated SOLR-5544:
---

Priority: Minor  (was: Major)

 Log spamming DefaultSolrHighlighter
 ---

 Key: SOLR-5544
 URL: https://issues.apache.org/jira/browse/SOLR-5544
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0
Reporter: MANISH KUMAR
Priority: Minor

 In DefaultSolrHighlighter.java
 The method useFastVectorHighlighter has
  log.warn( "Solr will use Highlighter instead of FastVectorHighlighter 
 because {} field does not store TermPositions and TermOffsets.", fieldName );
 The above method gets called for each field, and there can be cases where 
 TermPositions & TermOffsets are not stored.
 The above line causes huge spamming of the logs.
 It should be at most a DEBUG-level log, which gives the flexibility of turning 
 it off in production environments.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5544) Log spamming by DefaultSolrHighlighter

2013-12-09 Thread MANISH KUMAR (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MANISH KUMAR updated SOLR-5544:
---

Summary: Log spamming by DefaultSolrHighlighter  (was: Log spamming 
DefaultSolrHighlighter)

 Log spamming by DefaultSolrHighlighter
 --

 Key: SOLR-5544
 URL: https://issues.apache.org/jira/browse/SOLR-5544
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0
Reporter: MANISH KUMAR
Priority: Minor

 In DefaultSolrHighlighter.java
 The method useFastVectorHighlighter has
  log.warn( "Solr will use Highlighter instead of FastVectorHighlighter 
 because {} field does not store TermPositions and TermOffsets.", fieldName );
 The above method gets called for each field, and there can be cases where 
 TermPositions & TermOffsets are not stored.
 The above line causes huge spamming of the logs.
 It should be at most a DEBUG-level log, which gives the flexibility of turning 
 it off in production environments.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org