Loading data to solr from mysql
Can anybody suggest a way to load data from MySQL to Solr directly?

--
View this message in context: http://lucene.472066.n3.nabble.com/Loading-data-to-solr-from-mysql-tp2442184p2442184.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Loading data to solr from mysql
http://wiki.apache.org/solr/DataImportHandler

On Mon, Feb 7, 2011 at 11:16 AM, Bagesh Sharma <mail.bag...@gmail.com> wrote:

Can anybody suggest a way to load data from MySQL to Solr directly?
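To make the wiki pointer concrete, here is a minimal data-config.xml sketch for pulling rows out of MySQL with the DataImportHandler. The database name, table, columns, and credentials are all invented for illustration; adjust them to your own schema:

```xml
<!-- data-config.xml: a hypothetical MySQL -> Solr import -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="solr_reader"
              password="secret"/>
  <document>
    <entity name="product"
            query="SELECT id, name, description FROM products">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="description" name="description"/>
    </entity>
  </document>
</dataConfig>
```

The handler itself is registered in solrconfig.xml, and the import is then kicked off with a request such as /dataimport?command=full-import.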
Solr Error
Please share your insights on the error below.

Regards,
Prasad

Exception in thread "Timer-1" java.lang.OutOfMemoryError: Java heap space
        at org.mortbay.util.URIUtil.decodePath(URIUtil.java:285)
        at org.mortbay.jetty.HttpURI.getDecodedPath(HttpURI.java:395)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:486)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.UnicodeUtil.UTF16toUTF8(UnicodeUtil.java:236)
        at org.apache.lucene.store.IndexOutput.writeString(IndexOutput.java:103)
        at org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:231)
        at org.apache.lucene.index.FieldsWriter.addDocument(FieldsWriter.java:268)
        at org.apache.lucene.index.SegmentMerger.copyFieldsNoDeletions(SegmentMerger.java:451)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:352)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5112)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4675)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
Re: Solr Error
I have already allocated about 2 GB: -Xmx2048m.

Regards,
Prasad

On 7 February 2011 18:17, Ahmet Arslan <iori...@yahoo.com> wrote:

Pl share your insights on the error.
java.lang.OutOfMemoryError: Java heap space

What happens if you increase the Java heap space? java -Xmx1g -jar start.jar
Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene
Heap usage can spike after a commit: existing caches are still in use while new caches are being generated and/or autowarmed. Can you confirm this is the case?

On Friday 28 January 2011 00:34:42 Simon Wistow wrote:

On Tue, Jan 25, 2011 at 01:28:16PM +0100, Markus Jelsma said:

Are you sure you need CMS incremental mode? It's only advised when running on a machine with one or two processors. If you have more, you should consider disabling the incremental flags.

I'll test again, but we added those to get better performance - not much, but there did seem to be an improvement. The problem seems to be not in average use, but that occasionally there's a huge spike in load (there doesn't seem to be a particular killer query) and Solr just never recovers.

Thanks,
Simon

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: Solr Error
What is your index size, and how much RAM do you have?

-----
Thanx:
Grijesh
http://lucidimagination.com
Re: Solr Indexing Performance
On Sat, Feb 5, 2011 at 2:06 PM, Darx Oman <darxo...@gmail.com> wrote:

I indexed 1000 pdf files with the same configuration; it completed in about 32 min.

So it seems like your indexing scales at least as well as the number of PDF documents that you have. While this might be good news in your case, it is difficult to estimate an expected indexing rate when indexing from documents.

Regards,
Gora
DIH keeps felling during full-import
I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this? Is there an easy way to use DIH to break apart a full-import into multiple pieces, i.e. 3 mini-imports instead of 1 large import? Thanks.

Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.GeneratedConstructorAccessor27.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
        at com.mysql.jdbc.Util.getInstance(Util.java:382)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4751)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4345)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1564)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:399)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:390)
        at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:174)
        at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:165)
        at org.apache.solr.handler.dataimport.DataConfig.clearCaches(DataConfig.java:332)
        at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:360)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)

Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@1a797305 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:934)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:931)
        at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2724)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1895)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2140)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2620)
        at com.mysql.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:4854)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4737)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4345)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1564)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:399)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:390)
        at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:174)
        at org.apache.solr.handler.dataimport.DataConfig.clearCaches(DataConfig.java:332)
        at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:360)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)

Feb 7, 2011 7:03:29 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.GeneratedConstructorAccessor27.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
        at com.mysql.jdbc.Util.getInstance(Util.java:382)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4751)
Re: DIH keeps failing during full-import
Typo in subject.

On 2/7/11 7:59 AM, Mark wrote:

I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this? Is there an easy way to use DIH to break apart a full-import into multiple pieces? IE 3 mini-imports instead of 1 large import? Thanks.

Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
[...]
Re: DIH keeps felling during full-import
On Mon, Feb 7, 2011 at 9:29 PM, Mark <static.void@gmail.com> wrote:

I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this? Is there an easy way to use DIH to break apart a full-import into multiple pieces? IE 3 mini-imports instead of 1 large import? Thanks.

Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
[...]

This looks like a network issue, or some other failure in communicating with the MySQL database. Is that a possibility? Also, how many records are you importing, what is the data size, what is the quality of the network connection, etc.?

One way to break up the number of records imported at a time is to shard your data at the database level, but the advisability of this option depends on whether there is a more fundamental issue.

Regards,
Gora
Re: DIH keeps felling during full-import
Full import is around 6M documents, which when completed totals around 30GB in size. I'm guessing it could be a database connectivity problem, because I also see these types of errors on delta-imports, which could be anywhere from 20K to 300K records.

On 2/7/11 8:15 AM, Gora Mohanty wrote:

This looks like a network issue, or some other failure in communicating with the MySQL database. Is that a possibility? Also, how many records are you importing, what is the data size, what is the quality of the network connection, etc.?

One way to break up the number of records imported at a time is to shard your data at the database level, but the advisability of this option depends on whether there is a more fundamental issue.

Regards,
Gora
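One way to split a DIH full-import into smaller runs (a sketch, not something confirmed in this thread): DIH substitutes request parameters into entity queries, so you can bound the SQL with hypothetical startId/endId parameters and run several smaller imports back to back, passing clean=false on all but the first so earlier chunks are not deleted:

```xml
<!-- data-config.xml: entity bounded by request parameters
     (table, columns, and parameter names are our own choices) -->
<entity name="item"
        query="SELECT id, title, body FROM items
               WHERE id &gt;= ${dataimporter.request.startId}
                 AND id &lt; ${dataimporter.request.endId}">
  <field column="id" name="id"/>
  <field column="title" name="title"/>
  <field column="body" name="body"/>
</entity>
```

The first chunk would then be requested as /dataimport?command=full-import&startId=0&endId=2000000, and subsequent chunks with clean=false and the next id range.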
hl.snippets in solr 3.1
hi all, I'm trying to get a result like:

blabla <b>keyword</b> blabla ... blabla <b>keyword</b> blabla ...

so I'd like to show 2 fragments. I've added these settings:

<str name="hl.simple.pre"><![CDATA[<b>]]></str>
<str name="hl.simple.post"><![CDATA[</b>]]></str>
<str name="f.content.hl.fragsize">20</str>
<str name="f.content.hl.snippets">3</str>

but I get only 1 fragment: blabla <b>keyword</b> blabla. Am I trying to do it the right way? Is it something that can be done via changes in the config file? How do I add a separator between fragments (like "..." in this example)? thanks.
Re: HTTP ERROR 400 undefined field: *
Thanks Otis, I'll give that a try.

Jed.

On 02/06/2011 08:06 PM, Otis Gospodnetic wrote:

Yup, here it is, warning about needing to reindex: http://twitter.com/#!/lucene/status/28694113180192768

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
From: Erick Erickson <erickerick...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Sun, February 6, 2011 9:43:00 AM
Subject: Re: HTTP ERROR 400 undefined field: *

I *think* that there was a post a while ago saying that if you were using trunk 3_x, one of the recent changes required re-indexing, but don't quote me on that. Have you tried that?

Best
Erick

On Fri, Feb 4, 2011 at 2:04 PM, Jed Glazner <jglaz...@beyondoblivion.com> wrote:

Sorry for the lack of details. It's all clear in my head.. :)

We checked out the head revision from the 3.x branch a few weeks ago (https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/). We picked up r1058326. We upgraded from a previous checkout (r960098). I am using our customized schema.xml and the solrconfig.xml from the old revision with the new checkout. After upgrading I just copied the data folders from each core into the new checkout (hoping I wouldn't have to re-index the content, as this takes days). Everything seems to work fine, except that now I can't get the score to return. The stack trace is attached.

I also saw this warning in the logs; not sure exactly what it's talking about:

Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
WARNING: the luceneMatchVersion is not specified, defaulting to LUCENE_24 emulation. You should at some point declare and reindex to at least 3.0, because 2.4 emulation is deprecated and will be removed in 4.0. This parameter will be mandatory in 4.0.
Here is my request handler. The actual fields here are different than what is in mine, but I'm a little uncomfortable publishing how our company's search service works to the world:

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="defType">edismax</str>
    <bool name="tv">true</bool>
    <!-- standard fields to query on -->
    <str name="qf">field_a^2 field_b^2 field_c^4</str>
    <!-- automatic phrase boosting! -->
    <str name="pf">field_d^10</str>
    <!-- boost function: we'll comment this out for now because we're passing it to solr as a parameter. Once we finalize the exact function we should move it here and take it out of the query string. -->
    <!-- <str name="bf">log(linear(field_e,0.001,1))^10</str> -->
    <str name="tie">0.1</str>
  </lst>
  <arr name="last-components">
    <str>tvComponent</str>
  </arr>
</requestHandler>

Anyway, hopefully this is enough info; let me know if you need more.

Jed.

On 02/03/2011 10:29 PM, Chris Hostetter wrote:

: I was working on a checkout of the 3.x branch from about 6 months ago.
: Everything was working pretty well, but we decided that we should update and
: get what was at the head. However after upgrading, I am now getting this

FWIW: please be specific. Head of what? The 3x branch? Or trunk? What revision in svn does that correspond to? (the svnversion command will tell you)

: HTTP ERROR 400 undefined field: *
:
: If I clear the fl parameter (default is set to *,score) then it works fine
: with one big problem: no score data. If I try to set fl=score I get the same
: error except it says undefined field: score?!
:
: This works great in the older version; what changed? I've googled for about
: an hour now and I can't seem to find anything.

I can't reproduce this using either trunk (r1067044) or 3x (r1067045); all of these queries work just fine...
http://localhost:8983/solr/select/?q=*
http://localhost:8983/solr/select/?q=solr&fl=*,score
http://localhost:8983/solr/select/?q=solr&fl=score
http://localhost:8983/solr/select/?q=solr

...you'll have to provide us with a *lot* more details to help understand why you might be getting an error (like: what your configs look like, what the request looks like, what the full stack trace of your error is in the logs, etc...)

-Hoss
Re: hl.snippets in solr 3.1
--- On Mon, 2/7/11, alex <alex.alex.alex.9...@gmail.com> wrote:

From: alex <alex.alex.alex.9...@gmail.com>
Subject: hl.snippets in solr 3.1
To: solr-user@lucene.apache.org
Date: Monday, February 7, 2011, 7:38 PM

hi all, I'm trying to get a result like: blabla <b>keyword</b> blabla ... blabla <b>keyword</b> blabla ... so I'd like to show 2 fragments. [...]

These two should be declared under the defaults section of your requestHandler:

<int name="f.content.hl.fragsize">20</int>
<int name="f.content.hl.snippets">3</int>

Where did you define them? Under the highlighting section in solrconfig.xml?
Re: HTTP ERROR 400 undefined field: *
: The stack trace is attached. I also saw this warning in the logs not sure

From your attachment...

SEVERE: org.apache.solr.common.SolrException: undefined field: score
        at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1357)

...this is one of the key pieces of info that was missing from your earlier email: that you are using the TermVectorComponent. It's likely that something changed in the TVC on 3x between the two versions you were using, and that change freaks out now on * or score in the fl.

You still haven't given us an example of the full URLs you are using that trigger this error. (It's possible there is something slightly off in your syntax - we don't know, because you haven't shown us.)

All in: this sounds like a newly introduced bug in TVC; please post the details into a new Jira issue.

As to the warning you asked about...

: Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
: WARNING: the luceneMatchVersion is not specified, defaulting to LUCENE_24
: emulation. You should at some point declare and reindex to at least 3.0,
: because 2.4 emulation is deprecated and will be removed in 4.0. This parameter
: will be mandatory in 4.0.

If you look at the example configs on the 3x branch it should be explained. It's basically just a new feature that lets you specify which quirks of the underlying lucene code you want (so on upgrading you are in control of whether you eliminate old quirks or not).

-Hoss
Re: hl.snippets in solr 3.1
Ahmet Arslan wrote:

These two should be declared under the defaults section of your requestHandler:
<int name="f.content.hl.fragsize">20</int>
<int name="f.content.hl.snippets">3</int>
Where did you define them? Under the highlighting section in solrconfig.xml?

yes, it's in solrconfig.xml:

<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <int name="rows">10</int>
    <str name="echoParams">explicit</str>
    <str name="qf">content^0.5 title^1.2</str>
    <str name="q.alt">*:*</str>
    <bool name="hl">true</bool>
    <str name="hl.fl">title content url</str>
    <str name="f.content.hl.fragsize">20</str>
    <str name="f.content.hl.snippets">3</str>
    <str name="f.content.hl.alternateField">content</str>
    <str name="f.title.hl.fragsize">0</str>
    <str name="f.title.hl.alternateField">title</str>
    <str name="f.url.hl.fragsize">0</str>
    <str name="f.url.hl.alternateField">url</str>
  </lst>
</requestHandler>

I don't include the whole config, because there are just default values in it. I can see changes if I change fragsize, but no hl.snippets.

And in schema.xml I have:

<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

and <field name="content" type="text" stored="true" indexed="true"/>
How to search for special chars like ä from ae?
Hi! I want to search for special chars like "mäcman" by giving similarly worded simple characters like "maecman". I used <filter class="solr.ASCIIFoldingFilterFactory"/> and I'm getting "mäcman" from "macman", but I'm not able to get "mäcman" from "maecman". Can this be done using any other filter?

Thanks,
Anithya
Re: hl.snippets in solr 3.1
I can see changes if I change fragsize, but no hl.snippets.

Maybe your text is too short to generate more than one snippet? What happens when you increase the hl.maxAnalyzedChars parameter? hl.maxAnalyzedChars=2147483647
Re: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
: While reloading a core I got this following error, when does this
: occur ? Prior to this exception I do not see anything wrong in the logs.

Well, there are really two distinct types of errors in your log...

: [#|2011-02-01T13:02:36.697-0500|SEVERE|sun-appserver2.1|org.apache.solr.servlet.SolrDispatchFilter|_ThreadID=25;_ThreadName=httpWorkerThread-9001-5;_RequestID=450f6337-1f5c-42bc-a572-f0924de36b56;|org.apache.lucene.store.LockObtainFailedException:
: Lock obtain timed out: NativeFSLock@/data/solr/core/solr-data/index/lucene-7dc773a074342fa21d7d5ba09fc80678-write.lock
: at org.apache.lucene.store.Lock.obtain(Lock.java:85)
: at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1565)
: at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1421)

...this is error #1, indicating that for some reason the IndexWriter Solr was trying to create wasn't able to get a native filesystem lock on your index directory -- is it possible you have two instances of Solr (or two Solr cores) trying to re-use the same data directory? (Diagnosing exactly why you got this error also requires knowing what filesystem you are using.)

: [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter
: was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE
: LEAK!!!|#]
:
: [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter
: was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE
: LEAK!!!|#]

...these errors are warning you that something very unexpected was discovered when the Garbage Collector tried to clean up the SolrIndexWriter -- it found that the SolrIndexWriter had never been formally closed.

In normal operation, this might indicate the existence of a bug in code not managing its resources properly -- and in fact, it does indicate the existence of a bug, in that evidently a "Lock obtain timed out" failure doesn't cause the SolrIndexWriter to be closed -- but in your case it's not really something to be worried about: it's just a cascading effect of the first error.

-Hoss
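For context (not part of Hoss's reply): the lock factory that failed above is chosen in solrconfig.xml. A sketch of the relevant settings as they looked in the Solr 1.4/3.x era; the values shown are common defaults and are worth checking against your own version's example config:

```xml
<!-- solrconfig.xml (inside <indexDefaults>/<mainIndex> in older versions) -->
<!-- "native" corresponds to the NativeFSLock in the exception above -->
<lockType>native</lockType>
<!-- clears a stale write lock left by a crashed JVM at startup;
     unsafe if two cores genuinely share the same index directory -->
<unlockOnStartup>false</unlockOnStartup>
```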
Indexing a date from a POJO
Hi,

I would like to know if the code below is correct, because the date is not displayed well in Luke. I have a POJO with a date defined as follows:

public class SolrPositionDTO {
    @Field
    private String address;

    @Field
    private Date beginDate;

And in the schema config file the field is defined as:

<field name="beginDate" type="date" indexed="true" stored="true"/>

Thanks in advance for your help,
JCD

--
Jean-Claude Dauphin
jc.daup...@gmail.com
jc.daup...@afus.unesco.org
http://kenai.com/projects/j-isis/
http://www.unesco.org/isis/
http://www.unesco.org/idams/
http://www.greenstone.org
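One thing worth checking (an observation from outside the thread): Solr's date fields store the canonical ISO-8601 UTC form, so whatever the POJO mapping produces should reach the index looking like the XML below. The address value is invented for illustration:

```xml
<add>
  <doc>
    <field name="address">123 Example Street</field>
    <field name="beginDate">2011-02-07T10:30:00Z</field>
  </doc>
</add>
```

If Luke shows something other than the yyyy-MM-dd'T'HH:mm:ss'Z' shape, the client-side date conversion is the first place to look.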
Re: hl.snippets in solr 3.1
Ahmet Arslan wrote:

I can see changes if I change fragsize, but no hl.snippets.

Maybe your text is too short to generate more than one snippet? What happens when you increase the hl.maxAnalyzedChars parameter? hl.maxAnalyzedChars=2147483647

It's working now. I guess it was a problem with the config file. Thanks!
RE: How to search for special chars like ä from ae?
Hi Anithya,

There is a mapping file for MappingCharFilterFactory that behaves the same as ASCIIFoldingFilterFactory: mapping-FoldToASCII.txt, located in Solr's example conf/ directory in Solr 3.1+. You can rename and then edit this file to map ä to ae, ü to ue, etc. (look for "WITH DIAERESIS" to quickly find characters with umlauts in the mapping file). There is a commented-out example of using MappingCharFilterFactory in Solr's example schema.xml.

If you are using Solr 1.4.X, you can download the mapping-FoldToASCII.txt file here (from the 3.x source tree): http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/mapping-FoldToASCII.txt

Please consider donating your work back to Solr if you decide to go this route.

Good luck,
Steve

-----Original Message-----
From: Anithya [mailto:surysha...@gmail.com]
Sent: Monday, February 07, 2011 12:09 PM
To: solr-user@lucene.apache.org
Subject: How to search for special chars like ä from ae?

Hi! I want to search for special chars like "mäcman" by giving similarly worded simple characters like "maecman". I used <filter class="solr.ASCIIFoldingFilterFactory"/> and I'm getting "mäcman" from "macman", but I'm not able to get "mäcman" from "maecman". Can this be done using any other filter?

Thanks,
Anithya
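A minimal sketch of how the pieces fit together, assuming you rename the edited file to something like mapping-german.txt (the field type name and file name below are our own choices, not from the thread):

```xml
<!-- schema.xml: run the char filter before tokenizing,
     so both indexed text and queries are folded the same way -->
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-german.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- mapping-german.txt would contain lines such as:
     "ä" => "ae"
     "ö" => "oe"
     "ü" => "ue"
     "ß" => "ss"
-->
```

Because "mäcman" in a document and "maecman" in a query then both fold to "maecman" before tokenizing, the two match.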
Spatial Solr - Representing a bounding box and searching for it
Hi everyone, I have been looking for a searching solution for spatial data and since I have worked with Solr before, I wanted to give the spatial features a try. 1. What is the default datum used for the LatLong type? Is it WGS 84? 2. What is the best way to represent a region (a bounding box to be exact) and search for it? Spatial metadata records usually contain an element that specifies the region that the record is representing. For example North American Profile (NAP) has the following element: <gmd:EX_GeographicBoundingBox> <gmd:westBoundLongitude> <gco:Decimal>-95.15605</gco:Decimal> </gmd:westBoundLongitude> <gmd:eastBoundLongitude> <gco:Decimal>-74.34407</gco:Decimal> </gmd:eastBoundLongitude> <gmd:southBoundLatitude> <gco:Decimal>41.436108</gco:Decimal> </gmd:southBoundLatitude> <gmd:northBoundLatitude> <gco:Decimal>54.61572</gco:Decimal> </gmd:northBoundLatitude> </gmd:EX_GeographicBoundingBox> which defines the bounding box containing the region. As far as I've seen, spatial fields in Solr are limited to points only. I tried using four LatLong to represent four corners of the region, but I couldn't get the bbox query to return the correct box: adding another sfield to the query had no effect. I also tried to use the fq=store:[45,-94 TO 46,-93] example by changing the store field into multivalue and putting the upper-right and lower-left into my document and using them as the range, but that also didn't work. So any suggestions on how to get this working? Sepehr
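Since point-only spatial fields can't index a region, one common workaround (an assumption here, not something proposed in the thread) is to index the four bounds as separate numeric fields (e.g. minLat, maxLat, minLon, maxLon — hypothetical names) and filter with range queries implementing the standard interval-overlap test. A stdlib-only sketch of that logic:

```java
// Sketch: the interval-overlap test behind a "boxes intersect" filter.
// Field names (minLat, maxLat, minLon, maxLon) are hypothetical; in Solr
// the test becomes range filter queries on those four fields.
public class BBox {
    final double minLat, maxLat, minLon, maxLon;

    BBox(double minLat, double maxLat, double minLon, double maxLon) {
        this.minLat = minLat; this.maxLat = maxLat;
        this.minLon = minLon; this.maxLon = maxLon;
    }

    // Two boxes overlap iff they overlap on both axes; as filter queries this
    // is roughly fq=maxLat:[qMinLat TO *] plus fq=minLat:[* TO qMaxLat], etc.
    static boolean intersects(BBox a, BBox b) {
        return a.minLat <= b.maxLat && b.minLat <= a.maxLat
            && a.minLon <= b.maxLon && b.minLon <= a.maxLon;
    }

    public static void main(String[] args) {
        // the NAP example box from the post, against a small query box inside it
        BBox nap = new BBox(41.436108, 54.61572, -95.15605, -74.34407);
        BBox query = new BBox(45.0, 46.0, -94.0, -93.0);
        System.out.println(intersects(nap, query)); // prints "true"
    }
}
```

The multiValued two-point attempt in the post fails because Solr treats each value independently; four scalar fields keep the corners addressable in a single document.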
Re: dynamic fields revisited
Just so anyone else can know and save themselves 1/2 hour if they spend 4 minutes searching. When putting a dynamic field into a document into an index, the name of the field RETAINS the 'constant' part of the dynamic field name. Example - If a dynamic integer field is named '*_i' in the schema.xml file, __and__ you insert a field named 'my_integer_i', which matches the globbed field name '*_i', __then__ the name of the field will be 'my_integer_i' in the index and in your GETs/(updating)POSTs to the index on that document and __NOT__ 'my_integer' like I was kind of hoping that it would be :-( I.e., the suffix (or prefix if you set it up that way) will NOT be dropped. I was hoping that everything except the globbing character, '*', would just be a flag to the query processor and disappear after being 'noticed'. Not so :-) -- View this message in context: http://lucene.472066.n3.nabble.com/dynamic-fields-revisited-tp2161080p2447814.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: dynamic fields revisited
It would be quite annoying if it behaved as you were hoping for. This way it is possible to use different field types (and analyzers) for the same field value. In faceting, for example, this can be important because you should use analyzed fields for q and fq but unanalyzed fields for facet.field. The same goes for sorting and range queries, where you can use the same field value to end up in different field types, one for sorting and one for a range query. Without the prefix or suffix of the dynamic field, one must statically declare the fields beforehand and lose the dynamic advantage. Just so anyone else can know and save themselves 1/2 hour if they spend 4 minutes searching. When putting a dynamic field into a document into an index, the name of the field RETAINS the 'constant' part of the dynamic field name. Example - If a dynamic integer field is named '*_i' in the schema.xml file, __and__ you insert a field named 'my_integer_i', which matches the globbed field name '*_i', __then__ the name of the field will be 'my_integer_i' in the index and in your GETs/(updating)POSTs to the index on that document and __NOT__ 'my_integer' like I was kind of hoping that it would be :-( I.e., the suffix (or prefix if you set it up that way) will NOT be dropped. I was hoping that everything except the globbing character, '*', would just be a flag to the query processor and disappear after being 'noticed'. Not so :-)
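For anyone following along, a sketch of the schema.xml declarations being discussed (field types assumed from the stock example schema):

```
<!-- one glob per type; a document field my_integer_i matches *_i and
     keeps its full name in the index, as described above -->
<dynamicField name="*_i" type="int"    indexed="true" stored="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
```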
Re: DIH keeps failing during full-import
You're probably better off in this instance creating your own process based on SolrJ and your jdbc-driver-of-choice. DIH doesn't provide much in the way of fine-grained control over all aspects of the process, and at 30+ hours I suspect you want some better control. FWIW, SolrJ is not very hard at all to use for this kind of thing. Best Erick On Mon, Feb 7, 2011 at 10:59 AM, Mark static.void@gmail.com wrote: Typo in subject On 2/7/11 7:59 AM, Mark wrote: I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this? Is there an easy way to use DIH to break apart a full-import into multiple pieces? IE 3 mini-imports instead of 1 large import? Thanks. Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection SEVERE: Ignoring Error when closing connection com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown. 
at sun.reflect.GeneratedConstructorAccessor27.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:532) at com.mysql.jdbc.Util.handleNewInstance(Util.java:407) at com.mysql.jdbc.Util.getInstance(Util.java:382) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927) at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4751) at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4345) at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1564) at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:399) at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:390) at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:174) at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:165) at org.apache.solr.handler.dataimport.DataConfig.clearCaches(DataConfig.java:332) at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:360) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection SEVERE: Ignoring Error when closing connection java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@1a797305 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries. 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:934) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:931) at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2724) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1895) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2140) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2620) at com.mysql.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:4854) at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4737) at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4345) at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1564) at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:399) at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:390) at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:174) at org.apache.solr.handler.dataimport.DataConfig.clearCaches(DataConfig.java:332) at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:360) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) Feb 7, 2011 7:03:29 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection SEVERE: Ignoring Error when closing connection com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown. at sun.reflect.GeneratedConstructorAccessor27.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:532) at
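Erick's SolrJ suggestion aside, the "3 mini-imports instead of 1" idea is workable by driving each run over a primary-key slice. The slicing itself is plain arithmetic; here is a stdlib-only sketch (how each lo/hi pair reaches the SQL, e.g. as a WHERE id BETWEEN clause fed to a separate import run, is an assumption left out):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split an id range [minId, maxId] into n contiguous chunks,
// each usable as "WHERE id BETWEEN lo AND hi" in a separate import run.
public class RangeSplitter {
    static List<long[]> split(long minId, long maxId, int n) {
        List<long[]> chunks = new ArrayList<long[]>();
        long span = maxId - minId + 1;
        long lo = minId;
        for (int i = 0; i < n; i++) {
            // spread the remainder over the first (span % n) chunks
            long size = span / n + (i < span % n ? 1 : 0);
            if (size == 0) continue;
            chunks.add(new long[] { lo, lo + size - 1 });
            lo += size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (long[] c : split(1, 10, 3))
            System.out.println(c[0] + ".." + c[1]); // prints 1..4, 5..7, 8..10
    }
}
```

Smaller slices also mean each JDBC connection stays open for a fraction of the 30 hours, which sidesteps the "communications link failure during rollback" seen above.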
Re: geodist and spacial search
Thanks Bill, much simpler :-) On Sat, Feb 5, 2011 at 3:56 AM, Bill Bell billnb...@gmail.com wrote: Why not just: q=*:* fq={!bbox} sfield=store pt=49.45031,11.077721 d=40 fl=store sort=geodist() asc http://localhost:8983/solr/select?q=*:*&sfield=store&pt=49.45031,11.077721&d=40&fq={!bbox}&sort=geodist%28%29%20asc That will sort, and filter up to 40km. No need for the fq={!func}geodist() sfield=store pt=49.45031,11.077721 Bill On 2/4/11 4:30 AM, Eric Grobler impalah...@googlemail.com wrote: Hi Grant, Thanks for the tip This seems to work: q=*:* fq={!func}geodist() sfield=store pt=49.45031,11.077721 fq={!bbox} sfield=store pt=49.45031,11.077721 d=40 fl=store sort=geodist() asc On Thu, Feb 3, 2011 at 7:46 PM, Grant Ingersoll gsing...@apache.org wrote: Use a filter query? See the {!geofilt} stuff on the wiki page. That gives you your filter to restrict down your result set, then you can sort by exact distance to get your sort of just those docs that make it through the filter. On Feb 3, 2011, at 10:24 AM, Eric Grobler wrote: Hi Erick, Thanks I saw that example, but I am trying to sort by distance AND specify the max distance in 1 query. The reason is: running bbox on 2 million documents with a 20km distance takes only 200ms. Sorting 2 million documents by distance takes over 1.5 seconds! So it will be much faster for solr to first filter the 20km documents and then to sort them. Regards Ericz On Thu, Feb 3, 2011 at 1:27 PM, Erick Erickson erickerick...@gmail.com wrote: Further down that very page G... Here's an example of sorting by distance ascending: - ...q=*:*&sfield=store&pt=45.15,-93.85&sort=geodist() asc http://localhost:8983/solr/select?wt=json&indent=true&fl=name,store&q=*:*&sfield=store&pt=45.15,-93.85&sort=geodist()%20asc The key is just the sort=geodist(), I'm pretty sure that's independent of the bbox, but I could be wrong. 
Best Erick On Wed, Feb 2, 2011 at 11:18 AM, Eric Grobler impalah...@googlemail.com wrote: Hi In http://wiki.apache.org/solr/SpatialSearch there is an example of a bbox filter and a geodist function. Is it possible to do a bbox filter and sort by distance - combine the two? Thanks Ericz -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem docs using Solr/Lucene: http://www.lucidimagination.com/search
Re: dynamic fields revisited
You can change the match to be my* and then insert the name you want. Bill Bell Sent from mobile On Feb 7, 2011, at 4:15 PM, gearond gear...@sbcglobal.net wrote: Just so anyone else can know and save themselves 1/2 hour if they spend 4 minutes searching. When putting a dynamic field into a document into an index, the name of the field RETAINS the 'constant' part of the dynamic field name. Example - If a dynamic integer field is named '*_i' in the schema.xml file, __and__ you insert a field named 'my_integer_i', which matches the globbed field name '*_i', __then__ the name of the field will be 'my_integer_i' in the index and in your GETs/(updating)POSTs to the index on that document and __NOT__ 'my_integer' like I was kind of hoping that it would be :-( I.e., the suffix (or prefix if you set it up that way) will NOT be dropped. I was hoping that everything except the globbing character, '*', would just be a flag to the query processor and disappear after being 'noticed'. Not so :-) -- View this message in context: http://lucene.472066.n3.nabble.com/dynamic-fields-revisited-tp2161080p2447814.html Sent from the Solr - User mailing list archive at Nabble.com.
q.alt=*:* for every request?
Hi, I use the dismax handler with solr 1.4. Sometimes my request comes with q and fq, and other times it doesn't come with q (only fq and q.alt=*:*). Is it OK if I send q.alt=*:* for every request? Does it have side effects on performance? -- Chhorn Chamnap http://chamnapchhorn.blogspot.com/
Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene
On Mon, Feb 07, 2011 at 02:06:00PM +0100, Markus Jelsma said: Heap usage can spike after a commit. Existing caches are still in use and new caches are being generated and/or auto warmed. Can you confirm this is the case? We see spikes after replication, which I suspect is, as you say, because of the ensuing commit. What we seem to have found is that when we weren't using the concurrent GC, stop-the-world GC runs would kill the app. Now that we're using CMS we occasionally find ourselves in situations where the app still has memory left over but the load on the machine spikes, the GC duty cycle goes to 100 and the app never recovers. Restarting usually helps, but sometimes we have to take the machine out of the load balancer, wait for a number of minutes and then put it back in. We're working on two hypotheses. Firstly, we're CPU bound somehow and at some point we cross some threshold where GC or something else is just unable to keep up. So whilst it looks like instantaneous death of the app, it's actually gradual resource exhaustion where the definition of 'gradual' is 'a very short period of time' (as opposed to some cataclysmic infinite-loop bug somewhere). Either that or ... Secondly, there's some sort of Query Of Death that kills machines. We just haven't found it yet, even when replaying logs. Or some combination of both. Or other things. It's maddeningly frustrating. We've also got to try deploying a custom solr.war and try using the MMapDirectory to see if that helps with anything.
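Not from the thread, but for anyone debugging the same symptoms: JVM options of this era commonly tried alongside CMS, plus GC logging to distinguish gradual exhaustion from a query of death. Heap sizes are placeholders; verify every flag against your own JVM before use:

```
java -Xms4g -Xmx4g \
     -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
     -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly \
     -verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log \
     -jar start.jar
```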
Re: Searching for negative numbers very slow
On Fri, Jan 28, 2011 at 12:29:18PM -0500, Yonik Seeley said: That's odd - there should be nothing special about negative numbers. Here are a couple of ideas: - if you have a really big index and querying by a negative number is much more rare, it could just be that part of the index wasn't cached by the OS and so the query needs to hit the disk. This can happen with any term and a really big index - nothing special for negatives here. - if -1 is a really common value, it can be slower. Is fq=uid:\-2 or other negative numbers really slow also? This was my first thought, but -1 is relatively common and we have other numbers just as common. Interestingly enough, fq=uid:-1 fq=foo:bar fq=alpha:omega is much (4x) slower than q=uid:-1 AND foo:bar AND alpha:omega, but only when searching for that number. I'm going to wave my hands here and say something like 'Maybe something to do with the field caches?'
Re: q.alt=*:* for every request?
There is no measurable performance penalty when setting the parameter, except maybe the execution of the query with a high value for rows. To make things easy, you can define q.alt=*:* as a default in your request handler. No need to specify it in the URL. Hi, I use dismax handler with solr 1.4. Sometimes, my request comes with q and fq, and others doesn't come with q (only fq and q.alt=*:*). It's quite ok if I send q.alt=*:* for every request? Does it have side effects on performance?
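A sketch of that request-handler default in solrconfig.xml (the handler name is an assumption):

```
<!-- solrconfig.xml: make q.alt the default so clients need not send it -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>
```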
Solr Analysis Package
I'd like to use the filter factories in the org.apache.solr.analysis package for tokenizing text in a separate application. I need to chain a couple tokenizers together like Solr does on indexing and query parsing. I have looked into the TokenizerChain class to do this. I have successfully implemented a tokenization chain, but was wondering if there is an established way to do this. I just hacked together something that happened to work. Below is a code snippet. Any advice would be appreciated. Dependencies: solr-core-1.4.0, lucene-core-2.9.3, lucene-snowball-2.9.3. I am not tied to these and could use different versions. P.S. Is this more of a question for the solr-dev mailing list? TokenizerFactory tokenizer = new WhitespaceTokenizerFactory(); Map<String,String> args = new HashMap<String,String>(); SnowballPorterFilterFactory porterFilter = new SnowballPorterFilterFactory(); porterFilter.init(args); args = new HashMap<String,String>(); args.put("generateWordParts", "1"); args.put("generateNumberParts", "1"); args.put("catenateWords", "1"); args.put("catenateNumbers", "1"); args.put("catenateAll", "0"); WordDelimiterFilterFactory wordFilter = new WordDelimiterFilterFactory(); wordFilter.init(args); LowerCaseFilterFactory lowercaseFilter = new LowerCaseFilterFactory(); TokenFilterFactory[] filters = new TokenFilterFactory[] { wordFilter, lowercaseFilter, porterFilter }; TokenizerChain chain = new TokenizerChain(tokenizer, filters); TokenStream stream = chain.tokenStream(null, new StringReader(builder.toString())); TermAttribute tm = (TermAttribute) stream.getAttribute(TermAttribute.class); while (stream.incrementToken()) { System.out.println(tm.term()); }
Re: Performance optimization of Proximity/Wildcard searches
Hi, Yes, assuming you didn't change the index files, say by optimizing the index, the hot portions of the index should remain in the OS cache unless something else kicked them out. Re other thread - I don't think I have those messages any more. Otis --- Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Salman Akram salman.ak...@northbaysolutions.net To: solr-user@lucene.apache.org Sent: Mon, February 7, 2011 2:49:44 AM Subject: Re: Performance optimization of Proximity/Wildcard searches Only a couple of thousand documents are added daily, so the old OS cache should still be useful since old documents remain the same, right? Also can you please comment on my other thread related to Term Vectors? Thanks! On Sat, Feb 5, 2011 at 8:40 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Yes, OS cache mostly remains (obviously index files that are no longer around are going to remain in the OS cache for a while, but will be useless and gradually replaced by new index files). How long warmup takes is not relevant here, but what queries you use to warm up the index and how much you auto-warm the caches. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Salman Akram salman.ak...@northbaysolutions.net To: solr-user@lucene.apache.org Sent: Sat, February 5, 2011 4:06:54 AM Subject: Re: Performance optimization of Proximity/Wildcard searches Correct me if I am wrong. Commit in index flushes SOLR cache but of course OS cache would still be useful? If an index is updated every hour then a warm up that takes less than 5 mins should be more than enough, right? On Sat, Feb 5, 2011 at 7:42 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Salman, Warming up may be useful if your caches are getting decent hit ratios. Plus, you are warming up the OS cache when you warm up. 
Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Salman Akram salman.ak...@northbaysolutions.net To: solr-user@lucene.apache.org Sent: Fri, February 4, 2011 3:33:41 PM Subject: Re: Performance optimization of Proximity/Wildcard searches I know so we are not really using it for regular warm-ups (in any case the index is updated on an hourly basis). Just tried a few times to compare results. The issue is I am not even sure if warming up is useful for such regular updates. On Fri, Feb 4, 2011 at 5:16 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Salman, I only skimmed your email, but wanted to say that this part sounds a little suspicious: Our warm up script currently executes all distinct queries in our logs having count 5. It was run yesterday (with all the indexing update every It sounds like this will make warmup take a long time, assuming you have more than a handful of distinct queries in your logs. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Salman Akram salman.ak...@northbaysolutions.net To: solr-user@lucene.apache.org; t...@statsbiblioteket.dk Sent: Tue, January 25, 2011 6:32:48 AM Subject: Re: Performance optimization of Proximity/Wildcard searches By warmed index you only mean warming the SOLR cache or OS cache? As I said our index is updated every hour so I am not sure how much SOLR cache would be helpful but OS cache should still be helpful, right? I haven't compared the results with a proper script but from manual testing here are some of the observations. 'Recent' queries which are in cache of course return immediately (only if they are exactly the same - even if they took 3-4 mins the first time). I will need to test how many recent queries stay in cache, but still this would work only for very common queries. 
Users can run different queries and I want at least those to be at an 'acceptable' level (5-10 secs) even if not very fast. Our warm-up script currently executes all distinct queries in our logs having count 5. It was run yesterday
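For reference, the "handful of representative queries" approach Otis hints at is usually configured as a searcher event listener rather than an external script. A solrconfig.xml sketch (the query values are placeholders):

```
<!-- solrconfig.xml: static warming queries fired on each new searcher;
     a few representative queries usually beat replaying the whole log -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">some common query</str><str name="fq">category:books</str></lst>
    <lst><str name="q">another common query</str><str name="sort">price asc</str></lst>
  </arr>
</listener>
```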
Re: nested faceting ?
I think what you are trying to achieve is called taxonomy faceting. There is a solution for that. Check the slides for taxonomy faceting: http://www.lucidimagination.com/solutions/webcasts/faceting However, I don't know if you are able to render the hierarchy all at once. The solution I point to is for one hierarchy at a time. devices (100) accessories (1000) If devices is selected/clicked, then show -- Samsung (50) Sharp (50) If Accessories is selected/clicked, then show -- Samsung (500) Apple (500) -- View this message in context: http://lucene.472066.n3.nabble.com/nested-faceting-tp2389841p2449439.html Sent from the Solr - User mailing list archive at Nabble.com.
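One common encoding behind this one-level-at-a-time pattern (an assumption on my part, not taken from the linked slides) is to index every ancestor path with a depth prefix, then facet with facet.prefix set to the selected depth and path. The token-building part is trivial to sketch in plain Java:

```java
// Sketch of the depth-prefixed path encoding often used for taxonomy
// facets: index every ancestor path of a document's category, then query
// with facet.prefix=<depth>/<selected path>/ to count only the children.
public class TaxonomyPath {
    // Tokens to index for one document, e.g. for devices > samsung
    static String[] encode(String... segments) {
        String[] tokens = new String[segments.length];
        StringBuilder path = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            path.append('/').append(segments[i]);
            tokens[i] = i + path.toString(); // "0/devices", "1/devices/samsung"
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String t : encode("devices", "samsung"))
            System.out.println(t);
        // clicking "devices" then maps to facet.prefix=1/devices/
    }
}
```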
Solr n00b question: writing a custom QueryComponent
Hi all, Been a solr user for a while now, and now I need to add some functionality to solr for which I'm trying to write a custom QueryComponent. Couldn't get much help from websearch. So, turning to solr-user for help. I'm implementing search functionality for (micro)blog aggregation. We use solr 1.4.1. In the current solr config, the title and content fields are both indexed and stored in solr. Storing takes up a lot of space, even with compression. I'd like to store the title and content fields in MySQL instead of solr and retrieve these fields for results with an id lookup. Using the DataImportHandler won't work because we store just the title and content fields in MySQL. The rest of the fields are in solr itself. I wrote a custom component by extending QueryComponent, overriding only the finishStage(ResponseBuilder) function, where I try to retrieve the necessary records from MySQL. This is how the new QueryComponent is specified in solrconfig.xml: <searchComponent name="query" class="org.apache.solr.handler.component.TestSolr" /> I see that the component is getting loaded from the solr debug output: <lst name="prepare"> <double name="time">1.0</double> <lst name="org.apache.solr.handler.component.TestSolr"> <double name="time">0.0</double> </lst> ... But the strange thing is that the finishStage() function is not being called before returning results. What am I missing? Secondly, members like ResponseBuilder._responseDocs are visible only in the package org.apache.solr.handler.component. How do I access the results from my package? If you folks can give me links to a wiki or some sample custom QueryComponent, that'll be great. -- Thanks in advance. Ishwar. Just another resurrected Neozoic Archosaur comics. http://www.flickr.com/photos/mojosaurus/sets/72157600257724083/
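As an aside on the registration shown above: an alternative to redefining the stock "query" component is to register the custom component under its own name and splice it into a handler's component list. A solrconfig.xml sketch (handler and component names here are assumptions, not from the post):

```
<searchComponent name="testsolr" class="org.apache.solr.handler.component.TestSolr"/>
<requestHandler name="/mysearch" class="solr.SearchHandler">
  <arr name="components">
    <str>query</str>
    <str>testsolr</str>
    <str>debug</str>
  </arr>
</requestHandler>
```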
Re: q.alt=*:* for every request?
To be able to see this well, it would be lovely to have a switch that activates logging of the query expansion result. The Dismax QParserPlugin is particularly powerful there, so it'd be nice to see what's happening. Any logging category I need to activate? paul On 8 Feb 2011, at 03:22, Markus Jelsma wrote: There is no measurable performance penalty when setting the parameter, except maybe the execution of the query with a high value for rows. To make things easy, you can define q.alt=*:* as a default in your request handler. No need to specify it in the URL. Hi, I use dismax handler with solr 1.4. Sometimes, my request comes with q and fq, and others doesn't come with q (only fq and q.alt=*:*). It's quite ok if I send q.alt=*:* for every request? Does it have side effects on performance?