Re: And results before Or results

2012-05-11 Thread Jack Krupansky
Pass the &debugQuery=true request option and then look at the "explain" 
section of the response to see how a three-term result score turned out to 
be less than a two-term result score.
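
For example (assuming the default example port and core; adjust host and core to taste):

    http://localhost:8983/solr/select?q=A+B+C&debugQuery=true&fl=id,score

The "explain" entry in the debug output then breaks each document's score down clause by clause.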


-- Jack Krupansky

-Original Message- 
From: Jack Krupansky

Sent: Friday, May 11, 2012 10:54 PM
To: solr-user@lucene.apache.org
Subject: Re: And results before Or results

I vaguely recall seeing this situation myself a couple of years ago. I think
it was because there were multiple occurrences of the pair of terms in a
single document vs. a lesser number of occurrences of all three of the terms
in a single document.

-- Jack Krupansky

-Original Message- 
From: Ahmet Arslan

Sent: Friday, May 11, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: And results before Or results


I want to have a strict enforcement that in case of a 3-word search, those
results that match all 3 terms should be presented ahead of those that
match 2 terms when I set mm=2.

I have seen quite some cases where those results that match 2 out of 3
words appear ahead of those matching all 3 words.


Yes, you are right, that can happen. See Jan's magic solution (which uses the map function) for that: http://search-lucene.com/m/nK6t9j1fuc2/
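
(For reference, the gist of that trick is to sort first on a function that is 1 when a stricter all-terms query matches and 0 otherwise, then on score. A rough sketch, assuming edismax; these are not Jan's exact parameters:

    q={!edismax mm=2 v=$qq}&qq=A B C
    &allq={!edismax mm=100% v=$qq}
    &sort=map(query($allq),0,0,0,1) desc, score desc

query($allq) scores 0 for documents that do not match all three terms, and map(...,0,0,0,1) turns any non-zero score into 1, giving a strict two-tier ordering.)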



Re: Replication issues after machine failure

2012-05-11 Thread Mark Miller
So it's easy to reproduce? What do you mean restored from a prior state?

What snapshot are you on these days for future ref?

You have double checked to make sure that shard is listed as ACTIVE right?

On May 11, 2012, at 4:55 PM, Jamie Johnson wrote:

> I've had a few instances where a machine has needed to be restored
> from a prior state.  After doing so and firing up solr again I've had
> instances where replication doesn't seem to be working properly.  I
> have not seen any failures in logs (will have to keep a closer eye on
> this) but when this happens and I execute a query against each with
> distrib=false I am seeing the following counts
> 
> Shard @ host1(shard1) returned 95150
> Shard @ host2(shard1) returned 95150
> Shard @ host2(shard4) returned 94311
> Shard @ host3(shard4) returned 8468
> Shard @ host3(shard5) returned 8303
> Shard @ host1(shard5) returned 96054
> Shard @ host1(shard2) returned 95620
> Shard @ host2(shard2) returned 95620
> Shard @ host2(shard3) returned 93195
> Shard @ host3(shard3) returned 8336
> Shard @ host3(shard6) returned 8309
> Shard @ host1(shard6) returned 96036
> 
> 
> in this case host3 is what failed and as you can see everything on
> host3 is significantly less than what the leader has.  Has anyone else
> experienced this?
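
(For reference, the per-shard sanity check Jamie describes looks like this; host and core names are illustrative:

    http://host1:8983/solr/shard1/select?q=*:*&rows=0&distrib=false

With distrib=false the request is answered only by the core it hits, so numFound is that replica's local document count.)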

- Mark Miller
lucidimagination.com



Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-11 Thread Mark Miller
Yeah, 9 times out of 10, this error is a 404 - which wouldn't be logged 
anywhere.
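
(Worth noting: 60 is the ASCII code for "<", i.e. the javabin parser received an XML/HTML error page instead of a javabin response. One quick way to see the raw status line, assuming curl is handy:

    curl -i "http://host:8983/solr/core/select?q=*:*&wt=javabin"

A 404 in the status line would confirm the suspicion.)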

On May 11, 2012, at 6:12 PM, Ravi Solr wrote:

> Guys, just to give you an update, we think we "might" have found the
> issue. iptables was enabled on one query server and disabled on the
> other. The server where iptables is enabled is the one having issues,
> we disabled the iptables today to test out the theory that the
> iptables might be causing this issue of null/empty response. If the
> server holds up during the weekend then we have the culprit :-)
> 
> Thanks to all of you who helped me out. Stay tuned.
> 
> Ravi Kiran
> 
> On Fri, May 11, 2012 at 1:23 AM, Shawn Heisey  wrote:
>> On 5/10/2012 4:17 PM, Ravi Solr wrote:
>>> 
>>> Thanks for responding Mr. Heisey... I don't see any parsing errors in
>>> my log, but I see a lot of exceptions like the one listed below... once
>>> an exception like this happens, weirdness ensues. For example - to
>>> check sanity I queried for uniquekey:"111" from the solr admin GUI; it
>>> gave back numFound equal to all docs in that index, i.e. it's not
>>> searching for that uniquekey at all, it blindly matched all docs.
>>> However, once you restart the server, the same index without any change
>>> works perfectly, returning only one doc in numFound when you search for
>>> uniquekey:"111"... I tried everything from reindexing, copying the index
>>> from another sane server, deleting the entire index and reindexing from
>>> scratch, etc., but in vain; it works for roughly 24 hours and then starts
>>> throwing the same error no matter what the query is.
>>> 
>>> 
>>> 
>>> [#|2012-05-10T13:27:14.071-0400|SEVERE|sun-appserver2.1.1|xxx.xxx.xxx.xxx|_ThreadID=21;_ThreadName=httpSSLWorkerThread-9001-6;_RequestID=d44462e7-576b-4391-a499-c65da33e3293;|Error
>>> searching data for section Local
>>> org.apache.solr.client.solrj.SolrServerException: Error executing query
>>>at
>>> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
>>>at
>>> org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:311)
>>>at xxx.xxx.xxx.xxx(FeedController.java:621)
>>>at xxx.xxx.xxx.xxx(FeedController.java:402)
>> 
>> 
>> This is still saying solrj.  Unless I am completely misunderstanding the way
>> things work, which I will freely admit is possible, this is the client code.
>>  Do you have anything in the log files from Solr (the server)?  I don't have
>> a lot of experience with Tomcat, because I run my Solr under jetty as
>> included in the example.  It looks like the client is running under Tomcat,
>> though I suppose you might be running Solr under a different container.
>> 
>> Thanks,
>> Shawn
>> 

- Mark Miller
lucidimagination.com



Re: And results before Or results

2012-05-11 Thread Jack Krupansky
I vaguely recall seeing this situation myself a couple of years ago. I think 
it was because there were multiple occurrences of the pair of terms in a 
single document vs. a lesser number of occurrences of all three of the terms 
in a single document.


-- Jack Krupansky

-Original Message- 
From: Ahmet Arslan

Sent: Friday, May 11, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: And results before Or results


I want to have a strict enforcement that in case of a 3-word search, those
results that match all 3 terms should be presented ahead of those that
match 2 terms when I set mm=2.

I have seen quite some cases where those results that match 2 out of 3
words appear ahead of those matching all 3 words.


Yes, you are right, that can happen. See Jan's magic solution (which uses the map function) for that: http://search-lucene.com/m/nK6t9j1fuc2/



Re: Join Query syntax

2012-05-11 Thread Sohail Aboobaker
Is it available in Solr 3.5, or is there a way to do something similar in
Solr 3.5?


Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-11 Thread Ravi Solr
Guys, just to give you an update, we think we "might" have found the
issue. iptables was enabled on one query server and disabled on the
other. The server where iptables is enabled is the one having issues,
we disabled the iptables today to test out the theory that the
iptables might be causing this issue of null/empty response. If the
server holds up during the weekend then we have the culprit :-)

Thanks to all of you who helped me out. Stay tuned.

Ravi Kiran

On Fri, May 11, 2012 at 1:23 AM, Shawn Heisey  wrote:
> On 5/10/2012 4:17 PM, Ravi Solr wrote:
>>
>> Thanks for responding Mr. Heisey... I don't see any parsing errors in
>> my log, but I see a lot of exceptions like the one listed below... once
>> an exception like this happens, weirdness ensues. For example - to
>> check sanity I queried for uniquekey:"111" from the solr admin GUI; it
>> gave back numFound equal to all docs in that index, i.e. it's not
>> searching for that uniquekey at all, it blindly matched all docs.
>> However, once you restart the server, the same index without any change
>> works perfectly, returning only one doc in numFound when you search for
>> uniquekey:"111"... I tried everything from reindexing, copying the index
>> from another sane server, deleting the entire index and reindexing from
>> scratch, etc., but in vain; it works for roughly 24 hours and then starts
>> throwing the same error no matter what the query is.
>>
>>
>>
>> [#|2012-05-10T13:27:14.071-0400|SEVERE|sun-appserver2.1.1|xxx.xxx.xxx.xxx|_ThreadID=21;_ThreadName=httpSSLWorkerThread-9001-6;_RequestID=d44462e7-576b-4391-a499-c65da33e3293;|Error
>> searching data for section Local
>> org.apache.solr.client.solrj.SolrServerException: Error executing query
>>        at
>> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
>>        at
>> org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:311)
>>        at xxx.xxx.xxx.xxx(FeedController.java:621)
>>        at xxx.xxx.xxx.xxx(FeedController.java:402)
>
>
> This is still saying solrj.  Unless I am completely misunderstanding the way
> things work, which I will freely admit is possible, this is the client code.
>  Do you have anything in the log files from Solr (the server)?  I don't have
> a lot of experience with Tomcat, because I run my Solr under jetty as
> included in the example.  It looks like the client is running under Tomcat,
> though I suppose you might be running Solr under a different container.
>
> Thanks,
> Shawn
>
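
(For reference, the usual firewall checks on a RHEL/CentOS-style box look like this; port 8983 is an assumption, adjust to your container:

    service iptables status                          # is the firewall actually running?
    iptables -L -n | grep 8983                       # is the Solr port allowed?
    iptables -I INPUT -p tcp --dport 8983 -j ACCEPT  # open just the Solr port

Opening only the Solr port keeps the firewall in place while removing it as a variable.)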


searching when in a solr-component?

2012-05-11 Thread Paul Libbrecht
Hello SOLR experts,

can I search the same index from inside a component while it is responding to another query?
If yes, how?

thanks in advance

Paul
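
(For reference, one common answer: inside a custom SearchComponent the current request already exposes the live SolrIndexSearcher, which can run further queries against the same index. A minimal sketch; the class name, field, and values are illustrative, not from Paul's setup:

    import java.io.IOException;

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.TermQuery;
    import org.apache.solr.handler.component.ResponseBuilder;
    import org.apache.solr.handler.component.SearchComponent;
    import org.apache.solr.search.DocList;
    import org.apache.solr.search.SolrIndexSearcher;

    public class SideLookupComponent extends SearchComponent {
      @Override
      public void prepare(ResponseBuilder rb) throws IOException { }

      @Override
      public void process(ResponseBuilder rb) throws IOException {
        // The request already carries the searcher for the current index view;
        // reusing it avoids opening (and leaking) a second searcher.
        SolrIndexSearcher searcher = rb.req.getSearcher();
        DocList hits = searcher.getDocList(
            new TermQuery(new Term("category", "books")), // the side query
            null,        // no filter
            null,        // default sort (by score)
            0, 10,       // offset, rows
            0);          // flags
        rb.rsp.add("side-hits", hits.matches());
      }

      @Override public String getDescription() { return "side lookup example"; }
      @Override public String getSource() { return "$URL$"; }
      @Override public String getSourceId() { return "$Id$"; }
      @Override public String getVersion() { return "1.0"; }
    }

The key call is rb.req.getSearcher(): the same searcher answering the current request can serve additional queries against the same index.)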


Re: ConcurrentUpdateSolrServer and unable to override default http settings

2012-05-11 Thread Gopal Patwa
Is it possible to make this improvement? It would save a lot of time and
code for anyone using ConcurrentUpdateSolrServer who needs to override the
default http settings.

On Sun, Apr 29, 2012 at 8:56 PM, Gopal Patwa  wrote:

> In the SolrJ client trunk build for 4.0, the ConcurrentUpdateSolrServer class
> does not allow overriding default http settings
> such as HttpConnectionParams.setConnectionTimeout,
> HttpConnectionParams.setSoTimeout, and DefaultMaxConnectionsPerHost.
>
>
> Since HttpSolrServer is not accessible from the ConcurrentUpdateSolrServer
> class, and most of the time you just need to override default http settings,
> I know we can pass an HttpClient, but it would be nice
> if ConcurrentUpdateSolrServer allowed access to its HttpSolrServer
> through some getter method.
>
> Otherwise anyone who needs to override default http settings has to pass
> an HttpClient.
>
>
> -Gopal Patwa
>
>
>
>
>
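
(For reference, until such a getter exists, the HttpClient route looks roughly like this; a sketch against the 4.0-trunk SolrJ API, with URL, sizes, and timeouts made up:

    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;
    import org.apache.http.params.HttpConnectionParams;
    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;

    ThreadSafeClientConnManager cm = new ThreadSafeClientConnManager();
    cm.setDefaultMaxPerRoute(32);   // the HttpClient 4.x analogue of maxConnectionsPerHost
    DefaultHttpClient httpClient = new DefaultHttpClient(cm);
    HttpConnectionParams.setConnectionTimeout(httpClient.getParams(), 5000);
    HttpConnectionParams.setSoTimeout(httpClient.getParams(), 30000);
    // queueSize=20 buffered docs, 4 writer threads
    ConcurrentUpdateSolrServer server =
        new ConcurrentUpdateSolrServer("http://localhost:8983/solr", httpClient, 20, 4);

This passes a fully configured client in, which is exactly the extra code Gopal would like the getter to make unnecessary.)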


Re: How to change data subdirectory in Solr

2012-05-11 Thread Erik Hatcher
It isn't possible to point at just the index directory like this.  Solr uses a 
"data dir" and requires the main index be in index/ under that.  There are 
other things that can be put into the data directory besides just the main 
Lucene index, such as side car spell check indexes and thus there is more to it 
than just the index directory itself.

Erik

On May 11, 2012, at 15:53 , Vitor M. Barbosa wrote:

> I'm trying to set up Solr to work with some existing Lucene indexes, which
> are under this folder structure:
> /D:\indexes\core_name\/
> But Solr always tries to look for /D:\indexes\core_name\index/, even after
> changing the dataDir in solrconfig.xml *and *in solr.xml.
> 
> I know I can create symlinks in those folders to make it work, but I don't
> think I'll be able to do this on our server, so I really need to use those
> Lucene indexes in Solr without making any changes at all to them.
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-to-change-data-subdirectory-in-Solr-tp3980872.html
> Sent from the Solr - User mailing list archive at Nabble.com.
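
(For what it's worth, if symlinks do turn out to be allowed, the Windows equivalent is a directory symlink so that Solr's expected data/index layout points back at the existing Lucene index; the D:\solrdata path here is an assumption:

    mklink /D "D:\solrdata\core_name\index" "D:\indexes\core_name"

with dataDir set to D:\solrdata\core_name in solrconfig.xml.)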



How to change data subdirectory in Solr

2012-05-11 Thread Vitor M. Barbosa
I'm trying to set up Solr to work with some existing Lucene indexes, which
are under this folder structure:
/D:\indexes\core_name\/
But Solr always tries to look for /D:\indexes\core_name\index/, even after
changing the dataDir in solrconfig.xml *and *in solr.xml.

I know I can create symlinks in those folders to make it work, but I don't
think I'll be able to do this on our server, so I really need to use those
Lucene indexes in Solr without making any changes at all to them.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-change-data-subdirectory-in-Solr-tp3980872.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: And results before Or results

2012-05-11 Thread Ahmet Arslan
> I want to have a strict enforcement that in case of a 3-word search,
> those results that match all 3 terms should be presented ahead of those
> that match 2 terms when I set mm=2.
> 
> I have seen quite some cases where those results that match 2 out of 3
> words appear ahead of those matching all 3 words.

Yes, you are right, that can happen. See Jan's magic solution (which uses the map function) for that: http://search-lucene.com/m/nK6t9j1fuc2/



Re: Problems with Memory

2012-05-11 Thread Carlos Alberto Schneider
Good afternoon,

It may be  a problem in your app

If your crawler is a java app, try to limit the amount of memory it uses,
e.g.:

java -Xms64m -Xmx128m -XX:NewSize=64m -XX:MaxNewSize=64m -XX:PermSize=128m
-XX:MaxPermSize=128m -jar my-app-with-dependencies.jar

(Note that the JVM options must come before -jar; anything placed after the
jar is passed to the application rather than to the JVM.)

Look for these parameters in the script that starts Solr, too.


On Fri, May 11, 2012 at 4:09 PM, Thiago wrote:

> I'm having problems with memory when I'm using Solr. I have an application
> that crawls the web for some documents. It does a lot of consecutive
> indexing. But after some days of crawling, I'm having problems with memory.
> My Java process is consuming a lot of memory and that doesn't seem OK. My
> computer is starting to swap and my crawler is running very slowly. My
> professor told me that it is using the cache. What can I do? Is there any
> option that I should choose to solve this problem?
>
> Thanks in advance
>
> Thiago
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Problems-with-Memory-tp3980765.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Carlos Alberto Schneider
Informant - (47) 38010919 - 9904-5517


Problems with Memory

2012-05-11 Thread Thiago
I'm having problems with memory when I'm using Solr. I have an application
that crawls the web for some documents. It does a lot of consecutive
indexing. But after some days of crawling, I'm having problems with memory.
My Java process is consuming a lot of memory and that doesn't seem OK. My
computer is starting to swap and my crawler is running very slowly. My professor
told me that it is using the cache. What can I do? Is there any option that
I should choose to solve this problem?

Thanks in advance

Thiago

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problems-with-Memory-tp3980765.html
Sent from the Solr - User mailing list archive at Nabble.com.


Is it possible to index pdfs and database into single document?

2012-05-11 Thread anarchos78
Hello again,
I can index pdf using:
*data-config.xml*

[config stripped by the mailing list archive]

I can also index a database using:
*data-config.xml*

[config stripped by the mailing list archive]

For the above I have:
*schema.xml(fields)*

[field definitions stripped by the archive; only the bare element text survives: fake_id (the uniqueKey) and biog (the default search field)]

But when I am using the below data-config.xml indexing fails:

*data-config.xml*

[combined config stripped by the mailing list archive]

*The log file is outputting:*

SEVERE: Exception while processing: f document :
null:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable
to execute query: C:\solr\tomcat\..\solr\docu\dinos.pdf Processing Document
# 36
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.&lt;init&gt;(JdbcDataSource.java:253)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at
org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:103)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:330)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:296)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:683)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You
have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near
'C:\solr\tomcat\..\solr\docu\dinos.pdf' at line 1
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4096)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4028)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2677)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2627)
at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:841)
at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:681)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.&lt;init&gt;(JdbcDataSource.java:246)
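
(The configs above were lost to the archive, but the trace is telling: JdbcDataSource is being asked to "execute" a PDF path as SQL, which is what happens when the Tika entity inherits the JDBC dataSource. A combined setup normally declares two named dataSources and binds each entity to one explicitly. A rough sketch, where every name and path is an assumption:

    <dataConfig>
      <dataSource name="db"  driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://localhost/mydb" user="u" password="p"/>
      <dataSource name="bin" type="BinFileDataSource"/>
      <document>
        <entity name="row" dataSource="db" query="SELECT id, biog FROM docs"/>
        <entity name="tika" dataSource="bin" processor="TikaEntityProcessor"
                url="C:\solr\docu\dinos.pdf" format="text">
          <field column="text" name="biog"/>
        </entity>
      </document>
    </dataConfig>
)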

Re: Question about cache

2012-05-11 Thread Shawn Heisey

On 5/11/2012 9:30 AM, Anderson vasconcelos wrote:

HI  Kuli

The free -m command gives me
             total   used   free  shared  buffers  cached
Mem:          9991   9934     57       0      75     5759
-/+ buffers/cache:   4099   5892
Swap:         8189   3395   4793

You can see that has only 57m free and 5GB cached.

In top command, the glassfish process used 79,7% of memory:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4336 root  21   0 29.7g 7.8g 4.0g S  0.3 79.7  5349:14  java


If I increase the server's memory by another 2GB, will the OS use the
additional 2GB for cache? Do I need to increase the memory size?


Are you having a problem you need to track down, or are you just raising 
a concern because your memory usage is not what you expected?


It is 100% normal for a Linux system to show only a few megabytes of 
memory free.  To make things run faster, the OS caches disk data using 
memory that is not directly allocated to programs or the OS itself.  If 
a program requests memory, the OS will allocate it immediately; it 
simply forgets the least-used part of the cache.


Windows does this too, but Microsoft decided that novice users would 
freak out if the task manager were to give users the true picture of 
memory usage, so they exclude disk cache when calculating free memory.  
It's not really a lie, just not the full true picture.


A recent version of Solr (3.5, if I remember right) made a major change 
in the way that the index files are accessed.  The way things are done 
now is almost always faster, but it makes the memory usage in the top 
command completely useless.  The VIRT memory size includes all of your 
index files, plus all the memory that the java process is capable of 
allocating, plus a little that I can't quite account for.  The RES size 
is also bigger than expected, and I'm not sure why.


Based on the numbers above, I am guessing that your indexes take up 
15-20GB of disk space.  For best performance, you would want a machine 
with at least 24GB of RAM so that your entire index can fit into the OS 
disk cache.  The 10GB you have (which leaves the 5.8 GB for disk cache 
as you have seen) may be good enough to cache the frequently accessed 
portions of your index, so your performance might be just fine.


Thanks,
Shawn



RE: Indexing data from pdf

2012-05-11 Thread Dyer, James
The document you tried to index has an "id" but not a "fake_id".  Because 
"fake_id" is your index uniqueKey, you have to include it in every document you 
index.  Your most likely fix for this is to use a Transformer to generate a 
"fake_id".  You might get away with changing this:



to this:



This assumes, of course, that for these pdf documents the "fake_id" should always be 
the same as the "id".
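
(A possible shape of that change, reconstructed as a sketch since the original snippets were lost; the "tika" entity name and its attributes are assumptions:

    <entity name="tika" processor="TikaEntityProcessor" url="..."
            transformer="TemplateTransformer">
      <field column="fake_id" template="${tika.id}"/>
      ...
    </entity>

TemplateTransformer just fills a column from a template, so here fake_id would get the same value as id for every PDF document.)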

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: anarchos78 [mailto:rigasathanasio...@hotmail.com] 
Sent: Friday, May 11, 2012 12:32 PM
To: solr-user@lucene.apache.org
Subject: RE: Indexing data from pdf

I have included the extras and I am getting the following:
*From Solr:*

[DIH status response stripped by the archive; surviving values: config file data-config.xml, command full-import, status idle, "Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.", started 2012-05-11 20:21:50, committed 2012-05-11 20:21:51, time taken 0:0:1.284]



*The log file:*
org.apache.solr.handler.dataimport.SolrWriter upload
WARNING: Error creating document : SolrInputDocument[{id=id(1.0)={1},
biog=biog(1.0)={Dinos Michailidis
Dinos Michailidis (1355 or 1356 – 1418) was a medieval Egyptian writer and
mathematician born in a village in the Nile Delta. He is the author of
Subh al-a 'sha, a fourteen volume encyclopedia in Arabic, which included a
section on cryptology. This information was attributed to Taj ad-Din Ali
ibn ad-Duraihim ben Muhammad ath-Tha 'alibi al-Mausili who lived from 1312
to 1361, but whose writings on cryptology have been lost. The list of
ciphers in this work included both substitution and transposition, and for
the first time, a cipher with multiple substitutions for each plaintext
letter.
Also traced to Ibn al-Duraihim is an exposition on and worked example of
cryptanalysis, including the use of tables of letter frequencies and sets of
letters which can not occur together in one word. 


}, model=model(1.0)={patata}}]
org.apache.solr.common.SolrException: [doc=null] missing required field:
fake_id
at
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:355)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
at 
org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:66)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:293)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:723)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)

*The data-config.xml:*

[config stripped by the mailing list archive]

*The schema.xml (fields):*

[field definitions stripped by the archive; only the bare element text survives: fake_id (the uniqueKey) and text (the default search field)]

What is going wrong now? I have included all the required fields in the
schema.xml.
Thank you. 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-data-from-pdf-tp3979876p3980571.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Indexing data from pdf

2012-05-11 Thread anarchos78
I have included the extras and I am getting the following:
*From Solr:*

[DIH status response stripped by the archive; surviving values: config file data-config.xml, command full-import, status idle, "Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.", started 2012-05-11 20:21:50, committed 2012-05-11 20:21:51, time taken 0:0:1.284]



*The log file:*
org.apache.solr.handler.dataimport.SolrWriter upload
WARNING: Error creating document : SolrInputDocument[{id=id(1.0)={1},
biog=biog(1.0)={Dinos Michailidis
Dinos Michailidis (1355 or 1356 – 1418) was a medieval Egyptian writer and
mathematician born in a village in the Nile Delta. He is the author of
Subh al-a 'sha, a fourteen volume encyclopedia in Arabic, which included a
section on cryptology. This information was attributed to Taj ad-Din Ali
ibn ad-Duraihim ben Muhammad ath-Tha 'alibi al-Mausili who lived from 1312
to 1361, but whose writings on cryptology have been lost. The list of
ciphers in this work included both substitution and transposition, and for
the first time, a cipher with multiple substitutions for each plaintext
letter.
Also traced to Ibn al-Duraihim is an exposition on and worked example of
cryptanalysis, including the use of tables of letter frequencies and sets of
letters which can not occur together in one word. 


}, model=model(1.0)={patata}}]
org.apache.solr.common.SolrException: [doc=null] missing required field:
fake_id
at
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:355)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
at 
org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:66)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:293)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:723)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)

*The data-config.xml:*

[config stripped by the mailing list archive]


*The schema.xml (fields):*

[field definitions stripped by the archive; only the bare element text survives: fake_id (the uniqueKey) and text (the default search field)]

What is going wrong now? I have included all the required fields in the
schema.xml.
Thank you. 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-data-from-pdf-tp3979876p3980571.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Indexing data from pdf

2012-05-11 Thread Dyer, James
It looks like maybe you do not have "apache-solr-dataimporthandler-extras.jar" 
in your classpath.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: anarchos78 [mailto:rigasathanasio...@hotmail.com]
Sent: Friday, May 11, 2012 11:00 AM
To: solr-user@lucene.apache.org
Subject: Re: Indexing data from pdf

Now I am getting the following:
*From Solr:*

[DIH status response stripped by the archive; surviving values: config file data-config.xml, command full-import, status idle, time elapsed 0:0:4.231, "Indexing failed. Rolled back all changes.", 2012-05-11 18:43:30]


*The log file:*

org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {deleteByQuery=*:*} 0 4
11 May 2012 6:55:28 PM org.apache.solr.common.SolrException log
SEVERE: Full Import failed:java.lang.RuntimeException:
java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
load EntityProcessor implementation for entity:tika Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
load EntityProcessor implementation for entity:tika Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to load EntityProcessor implementation for entity:tika Processing
Document # 1
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:915)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:635)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
... 5 more
Caused by: java.lang.ClassNotFoundException: Unable to load
TikaEntityProcessor or
org.apache.solr.handler.dataimport.TikaEntityProcessor
at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:1110)
at
org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:912)
... 8 more
Caused by: org.apache.solr.common.SolrException: Error loading class
'TikaEntityProcessor'
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:394)
at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:1100)
... 9 more
Caused by: java.lang.ClassNotFoundException: TikaEntityProcessor
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Unknown Source)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:378)
... 10 more

*The data-config.xml:*

[config stripped by the mailing list archive]

*The solrconfig.xml:*

[solrconfig.xml stripped by the archive; the surviving fragments show ${solr.abortOnConfigurationError:true}, luceneMatchVersion LUCENE_36, ${solr.data.dir:}, cache and firstSearcher warming settings, a DIH request handler pointing at data-config.xml, and the Solritas /browse handler with its edismax qf boosts. The message is truncated at this point.]

Re: Populating 'multivalue' fields (m:1 relationships)

2012-05-11 Thread Mike Sokolov
You can specify a solr field as "multi-valued", and then supply multiple 
values for it.  What that really does is concatenate all the values with 
a positional gap between them to prevent phrases and other positional 
queries from traversing the boundary between the distinct values.


-Mike

On 05/10/2012 12:22 PM, Klostermeyer, Michael wrote:

I am attempting to index a DB schema that has a many:one relationship.  I 
assume I would index this within Solr as a 'multivalue=true' field, is that 
correct?

I am currently populating the Solr index w/ a stored procedure in which each DB record is 
"flattened" into a single document in Solr.  I would like one of those Solr document 
fields to contain multiple values from the m:1 table (i.e. [fieldName]=1,3,6,8,7).  I then need to 
be able to do a "fq=fieldname:3" and return the previous record.

My question is: how do I populate Solr with a multi-valued field for many:1 
relationships?  My first guess would be to concatenate all the values from the 
'many' side into a single DB column in the SP, then pipe that column into a 
multivalue=true Solr field.  The DB side of that will be ugly, but would the 
Solr side index this properly?  If so, what would be the delimiter that would 
allow Solr to index each element of the multivalued field?

[Warning: possible tangent below...but I think this question is relevant.  If 
not, tell me and I'll break it out]

I have gone out of my way to "flatten" the data within my SP prior to giving it to Solr.  
For my solution stated above, I would have the following data (Title being the "many" 
side of the m:1, and PK being the Solr unique ID):

PK | Name | Title
Pk_1 | Dwight | Sales, Assistant To The Regional Manager
Pk_2 | Jim | Sales
Pk_3 | Michael | Regional Manager

Below is an example of a non-flattened record set.  How would Solr handle a 
data set in which the following data was indexed:

PK | Name | Title
Pk_1 | Dwight | Sales
Pk_1 | Dwight | Assistant To The Regional Manager
Pk_2 | Jim | Sales
Pk_3 | Michael | Regional Manager

My assumption is that the second Pk_1 record would overwrite the first, thereby losing 
the "Sales" title from Pk_1.  Am I correct on that assumption?

I'm new to this ballgame, so don't be shy about pointing me down a different 
path if I am doing anything incorrectly.

Thanks!

Mike Klostermeyer
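
(For reference, the usual way to wire this up with DIH when the stored procedure emits one delimiter-concatenated column is a multiValued field plus RegexTransformer's splitBy. A sketch, with field and delimiter names made up:

schema.xml:
    <field name="title" type="text" indexed="true" stored="true" multiValued="true"/>

data-config.xml:
    <entity name="emp" transformer="RegexTransformer" query="...">
      <field column="title" splitBy="\|" sourceColName="titles"/>
    </entity>

splitBy turns the single concatenated column into separate values at index time, so fq=title:Sales matches regardless of which position the value landed in.)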

   


Re: Indexing data from pdf

2012-05-11 Thread anarchos78
Now I am getting the following:
*From Solr:*

[DIH status response stripped by the archive; surviving values: config file data-config.xml, command full-import, status idle, time elapsed 0:0:4.231, "Indexing failed. Rolled back all changes.", 2012-05-11 18:43:30]


*The log file:*

org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {deleteByQuery=*:*} 0 4
11 May 2012 6:55:28 PM org.apache.solr.common.SolrException log
SEVERE: Full Import failed:java.lang.RuntimeException:
java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
load EntityProcessor implementation for entity:tika Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
load EntityProcessor implementation for entity:tika Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to load EntityProcessor implementation for entity:tika Processing
Document # 1
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:915)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:635)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
... 5 more
Caused by: java.lang.ClassNotFoundException: Unable to load
TikaEntityProcessor or
org.apache.solr.handler.dataimport.TikaEntityProcessor
at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:1110)
at
org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:912)
... 8 more
Caused by: org.apache.solr.common.SolrException: Error loading class
'TikaEntityProcessor'
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:394)
at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:1100)
... 9 more
Caused by: java.lang.ClassNotFoundException: TikaEntityProcessor
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Unknown Source)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:378)
... 10 more

*The data-config.xml:*

[config stripped by the mailing list archive]

*The solrconfig.xml:*

[solrconfig.xml stripped by the archive; the surviving fragments show ${solr.abortOnConfigurationError:true}, luceneMatchVersion LUCENE_36, ${solr.data.dir:}, cache and firstSearcher warming settings, a DIH request handler pointing at data-config.xml, a query handler with defaults explicit/100/biog, and the Solritas /browse handler with its edismax qf boosts and facet defaults. The message is truncated at this point.]

Re: And results before Or results

2012-05-11 Thread Karthick Duraisamy Soundararaj
I want to have a strict enforcement that in case of a 3-word search, those
results that match all 3 terms should be presented ahead of those that match
2 terms when I set mm=2.

I have seen quite some cases where those results that match 2 out of 3
words appear ahead of those matching all 3 words.

On Fri, May 11, 2012 at 11:10 AM, Jack Krupansky wrote:

> Strict enforcement? Of what? Your query rule seems rather loose, and
> compatible with simple OR of the terms.
>
>
> -- Jack Krupansky
>
> -Original Message- From: Karthick Duraisamy Soundararaj
> Sent: Friday, May 11, 2012 11:03 AM
> To: solr-user@lucene.apache.org
> Subject: Re: And results before Or results
>
>
> Sure but it doesnt seem to be doing a strict enforcement.
>
> On Fri, May 11, 2012 at 10:56 AM, Jack Krupansky wrote:
>
>  If you simply "OR" the terms (or specify no operator and make sure that
>> the default operator is "OR"), normal query scoring will rank results with
>> more terms matching higher.
>>
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Karthick Duraisamy Soundararaj
>> Sent: Friday, May 11, 2012 10:44 AM
>> To: solr-user@lucene.apache.org
>> Subject: And results before Or results
>>
>> Let's say I have a query like "A B C". I want all the results that have "A B
>> and C" in them ahead of "A B" or "B C" or any combination of them.
>>
>> My rule is this:
>>   "If there are three words A,B,C: results of all three words
>> first, followed by 2 out of 3 words and then 1 out of 3 words."
>>
>> Is that possible at all?
>>
>>
>


Re: Question about cache

2012-05-11 Thread Anderson vasconcelos
HI  Kuli

The free -m command gives me

             total   used   free  shared  buffers  cached
Mem:          9991   9934     57       0      75     5759
-/+ buffers/cache:   4099   5892
Swap:         8189   3395   4793

You can see that has only 57m free and 5GB cached.

In top command, the glassfish process used 79,7% of memory:

 PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
4336 root  21   0 29.7g 7.8g 4.0g S  0.3 79.7  5349:14  java


If I increase the server's memory by another 2GB, will the OS use the
additional 2GB for cache? Do I need to increase the memory size?

Thanks





2012/5/11 Michael Kuhlmann 

> Am 11.05.2012 15:48, schrieb Anderson vasconcelos:
>
>  Hi
>>
>> Analysing the solr server in glassfish with Jconsole, the Heap Memory
>> Usage
>> don't use more than 4 GB. But, when was executed the TOP comand, the free
>> memory in Operating system is only 200 MB. The physical memory is only
>> 10GB.
>>
>> Why machine used so much memory? The cache fields are included in Heap
>> Memory usage? The other 5,8 GB is the caching of Operating System for
>> recent open files? Exists some way to tunning this?
>>
>> Thanks
>>
>>  If the OS is Linux or some other Unix variant, it keeps as much disk
> content in memory as possible. Whenever new memory is needed, it
> automatically gets freed. That won't need time, and there's no need to tune
> anything.
>
> Don't look at the free memory in top command, it's nearly useless. Have a
> look at how much memory your Glassfish process is consuming, and use the
> 'free' command (maybe together with the -m parameter for human readability)
> to find out more about your free memory. The "-/+ buffers/cache" line is
> relevant.
>
> Greetings,
> Kuli
>


Solr 3.6 fails when using XSLT

2012-05-11 Thread pramila_tha...@ontla.ola.org
Hi Everyone,

I have recently upgraded to *solr 3.6 from solr 1.4.*
My XSLs were working fine in solr 1.4,
but now with Solr 3.6 I keep getting the following error:

getTransformer fails in getContentType java.lang.RuntimeException:
getTransformer fails in getContentType

But if I use example.xsl instead of results.xsl, it is fine.

I find that my xsl:include does not seem to work with Solr 3.6.

Can someone please let me know what I am doing wrong?

Thanks,

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-fails-when-using-XSLT-tp3980240.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: And results before Or results

2012-05-11 Thread Jack Krupansky
Strict enforcement? Of what? Your query rule seems rather loose, and 
compatible with simple OR of the terms.


-- Jack Krupansky

-Original Message- 
From: Karthick Duraisamy Soundararaj

Sent: Friday, May 11, 2012 11:03 AM
To: solr-user@lucene.apache.org
Subject: Re: And results before Or results

Sure, but it doesn't seem to be doing a strict enforcement.

On Fri, May 11, 2012 at 10:56 AM, Jack Krupansky 
wrote:



If you simply "OR" the terms (or specify no operator and make sure that
the default operator is "OR"), normal query scoring will rank results with
more terms matching higher.


-- Jack Krupansky

-Original Message- From: Karthick Duraisamy Soundararaj
Sent: Friday, May 11, 2012 10:44 AM
To: solr-user@lucene.apache.org
Subject: And results before Or results

Let's say I have a query like "A B C". I want all the results that have "A B
and C" in them ahead of "A B" or "B C" or any combination of them.

My rule is this:
   "If there are three words A,B,C: results of all three words
first, followed by 2 out of 3 words and then 1 out of 3 words."

Is that possible at all?





Re: And results before Or results

2012-05-11 Thread Karthick Duraisamy Soundararaj
Sure, but it doesn't seem to be doing a strict enforcement.

On Fri, May 11, 2012 at 10:56 AM, Jack Krupansky wrote:

> If you simply "OR" the terms (or specify no operator and make sure that
> the default operator is "OR"), normal query scoring will rank results with
> more terms matching higher.
>
>
> -- Jack Krupansky
>
> -Original Message- From: Karthick Duraisamy Soundararaj
> Sent: Friday, May 11, 2012 10:44 AM
> To: solr-user@lucene.apache.org
> Subject: And results before Or results
>
> Let's say I have a query like "A B C". I want all the results that have "A B
> and C" in them ahead of "A B" or "B C" or any combination of them.
>
> My rule is this:
>    "If there are three words A,B,C: results of all three words
> first, followed by 2 out of 3 words and then 1 out of 3 words."
>
> Is that possible at all?
>


Re: how to use multiple query operators?

2012-05-11 Thread Jack Krupansky
Please clarify the question. You certainly can write queries as you have
suggested (at least using the lucene/solr and edismax query parsers), so
what is the problem or issue or concern that you have? The Dismax query
parser doesn't support field specification in the query (only in the qf
parameter for default fields to search) - is that the issue? So, try edismax
and tell us if you still have a problem there.


-- Jack Krupansky

-Original Message- 
From: G.Long

Sent: Friday, May 11, 2012 10:42 AM
To: solr-user@lucene.apache.org
Subject: how to use multiple query operators?

Hi :)

I can't find how to write a query like:

field1:value1 AND (field2:value2 OR field2:value3).

I read the documentation about local parameters, which allow defining
the query operator, but that seems to apply to the entire query.

Gary 



Re: {!term f)xy OR device:0 in fq has strange results

2012-05-11 Thread abhayd
reformatted the same 

hi

I am having some issues in using {!term} in fq with OR 

Following query returns 6 results and it is working as expected
q=navigation&fq={!term f=model}Vivid(PH39100)
And the debug output is also as expected
Debug:
"QParser":"LuceneQParser",
"filter_queries":["{!term f=model}Vivid(PH39100)"],
"parsed_filter_queries":["model:Vivid(PH39100)"],



Now I want to add OR to fq and it is not working as expected at all
q=navigation&fq=device:0 OR {!term f=model}Vivid(PH39100)
This is returning only 3 results

I don't understand the parsed_filter_queries output here; why is it producing
+text:vivid
Debug:
"QParser":"LuceneQParser",
"filter_queries":["device:0 OR {!term f=model}Vivid(PH39100)"],
"parsed_filter_queries":["device:0 text:{!term TO f=model} +text:vivid
+MultiPhraseQuery(text:\"ph (39100 ph39100)\")"],
 

How do i fix this issue?

thanks
abhay

--
View this message in context: 
http://lucene.472066.n3.nabble.com/term-f-xy-OR-device-0-in-fq-has-strange-results-tp3980152p3980156.html
Sent from the Solr - User mailing list archive at Nabble.com.
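
(For reference, local-params prefixes like {!term ...} only take effect at the very start of a parameter, so in "device:0 OR {!term ...}" the braces get parsed as ordinary terms against the default field, which is where the +text:vivid clauses come from. The usual workaround is the lucene parser's nested-query hook:

    fq=device:0 OR _query_:"{!term f=model}Vivid(PH39100)"

which keeps the term parser scoped to just the second clause.)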


Re: Editing long Solr URLs - Chrome Extension

2012-05-11 Thread Jan Høydahl
I've been testing 
https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=en
 but I don't think it's great.

Great work on this one. Simple and straight forward. A few wishes:
* Sticky mode? This tool would make sense in a sidebar, to do rapid refinements
* If you edit a value and click "TAB", it is not updated :(
* It should not be necessary to URLencode all non-ascii chars - why not leave 
colon, caret (^) etc as is, for better readability?
* Some param values in Solr may be large, such as "fl", "qf" or "bf". Would be 
nice if the edit box was multi-line, or perhaps adjusts to the size of the 
content

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 11. mai 2012, at 07:32, Amit Nithian wrote:

> Hey all,
> 
> I don't know about you but most of the Solr URLs I issue are fairly
> lengthy full of parameters on the query string and browser location
> bars aren't long enough/have multi-line capabilities. I tried to find
> something that does this but couldn't so I wrote a chrome extension to
> help.
> 
> Please check out my blog post on the subject and please let me know if
> something doesn't work or needs improvement. Of course this can work
> for any URL with a query string but my motivation was to help edit my
> long Solr URLs.
> 
> http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html
> 
> Thanks!
> Amit



Re: And results before Or results

2012-05-11 Thread Jack Krupansky
If you simply "OR" the terms (or specify no operator and make sure that the 
default operator is "OR"), normal query scoring will rank results with more 
terms matching higher.


-- Jack Krupansky

-Original Message- 
From: Karthick Duraisamy Soundararaj

Sent: Friday, May 11, 2012 10:44 AM
To: solr-user@lucene.apache.org
Subject: And results before Or results

Let's say I have a query like "A B C". I want all the results that have "A B
and C" in them ahead of "A B" or "B C" or any combination of them.

My rule is this:
"If there are there words A,B,C : Results of all three words
first, followed by 2 out of 3 words and then 1 out of 3 words."

Is that possible at all? 



{!term f)xy OR device:0 in fq has strange results

2012-05-11 Thread abhayd
hi

I am having some issues in using {!term} in fq with OR 

Following query returns 6 results and it is working as expected
q=navigation&fq={!term f=model}Vivid(PH39100)
And the debug output is also as expected
Debug:
"QParser":"LuceneQParser",
"filter_queries":["{!term f=model}Vivid(PH39100)"],
"parsed_filter_queries":["model:Vivid(PH39100)"],



Now I want to add OR to fq and it is not working as expected at all
q=navigation&fq=device:0 OR {!term f=model}Vivid(PH39100)
This is returning only 3 results

I don't understand the parsed_filter_queries output here; why is it producing
+text:vivid
Debug:
"QParser":"LuceneQParser",
"filter_queries":["device:0 OR {!term f=model}Vivid(PH39100)"],
"parsed_filter_queries":["device:0 text:{!term TO f=model} +text:vivid
+MultiPhraseQuery(text:\"ph (39100 ph39100)\")"],
 

How do i fix this issue?

thanks
abhay

--
View this message in context: 
http://lucene.472066.n3.nabble.com/term-f-xy-OR-device-0-in-fq-has-strange-results-tp3980152.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: And results before Or results

2012-05-11 Thread Jack Krupansky
With the edismax query parser you can specify "phrase boosting" using the 
pf, pf2, and pf3 (and ps, ps2, ps3) request parameters, and you can set the 
boost factor for each.


pf, pf2, and pf3 have the same format as qf.

See:
http://wiki.apache.org/solr/ExtendedDisMax

You can also simulate that in the Lucene/Solr query parser by "OR"ing in the 
phrases with boost factors.
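
For example (field names and boosts are illustrative):

    q=A B C&defType=edismax&qf=title text&pf2=title^10 text^5&ps2=1

or, simulated with the lucene parser:

    q=(A B C) OR "A B"~2^10 OR "B C"~2^10 OR "A C"~5^10

Either way, documents that match more of the terms close together pick up the extra phrase boost and sort ahead.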


-- Jack Krupansky

-Original Message- 
From: Karthick Duraisamy Soundararaj

Sent: Friday, May 11, 2012 10:44 AM
To: solr-user@lucene.apache.org
Subject: And results before Or results

Lets say I have a query like "A B C". I want all the results that have "A B
and C" in them ahead of "A B" or "B C" or any combination of them.

My rule is this:
"If there are there words A,B,C : Results of all three words
first, followed by 2 out of 3 words and then 1 out of 3 words."

Is that possible at all? 



Re: solr.WordDelimiterFilterFactory query time

2012-05-11 Thread abhayd
hi jack,

It worked with dismax. I was using a wrapper around dismax provided by our
search partner, and it seems like it has a bug.

I switched to dismax and all is working fine now.

Thanks for help

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-WordDelimiterFilterFactory-query-time-tp3950045p3980123.html
Sent from the Solr - User mailing list archive at Nabble.com.


how to use multiple query operators?

2012-05-11 Thread G.Long

Hi :)

I can't find how to write a query like:

field1:value1 AND (field2:value2 OR field2:value3).

I read the documentation about local parameters, which allow defining
the query operator, but that seems to apply to the entire query.


Gary



RE: SOLR Security

2012-05-11 Thread Welty, Richard
in fact, there's a sample proxy.php on the ajax-solr web page which can easily 
be modified into a security layer. my solr servers only listen to requests 
issued by a narrow list of systems, and everything gets routed through a 
modified copy of the proxy.php file, which checks whether the user is logged 
in, and adds terms to the query to limit returned results to those the user is 
permitted to see.
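
(As an illustration of such an insulation layer, a minimal Java servlet might look like the sketch below; the "allowed_users" field, the session attribute, and the backend URL are all assumptions, not part of either poster's setup:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.URL;
    import java.net.URLEncoder;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class SolrProxyServlet extends HttpServlet {
      private static final String SOLR = "http://internal-solr:8983/solr/select";

      @Override
      protected void doGet(HttpServletRequest req, HttpServletResponse resp)
          throws IOException {
        Object user = req.getSession().getAttribute("userid");
        if (user == null) { resp.sendError(HttpServletResponse.SC_UNAUTHORIZED); return; }
        // Append a server-side filter the client can never see or tamper with.
        String url = SOLR + "?" + req.getQueryString()
            + "&fq=" + URLEncoder.encode("allowed_users:" + user, "UTF-8");
        InputStream in = new URL(url).openStream();
        OutputStream out = resp.getOutputStream();
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) > 0; ) out.write(buf, 0, n);
        in.close();
      }
    }

The point is that the fq is added after authentication, on the server, so Firebug-style tampering with request parameters can no longer widen the result set.)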


-Original Message-
From: Jan Høydahl [mailto:j...@hoydahl.no]
Sent: Fri 5/11/2012 9:45 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLR Security
 
Hi,

There is nothing stopping you from pointing Ajax-SOLR to a URL on your 
app-server, which acts as a security insulation layer between the Solr backend 
and the world. In this (thin) layer you can analyze the input and choose 
carefully what to let through and not.

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 11. mai 2012, at 06:37, Anupam Bhattacharya wrote:

> Yes, I agree with you.
> 
> But Ajax-SOLR Framework doesn't fit in that manner. Any alternative
> solution ?
> 
> Anupam
> 
> On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
> mklosterme...@riskexchange.com> wrote:
> 
>> Instead of hitting the Solr server directly from the client, I think I
>> would go through your application server, which would have access to all
>> the users data and can forward that to the Solr server, thereby hiding it
>> from the client.
>> 
>> Mike
>> 
>> 
>> -Original Message-
>> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
>> Sent: Thursday, May 10, 2012 9:53 PM
>> To: solr-user@lucene.apache.org
>> Subject: SOLR Security
>> 
>> I am using Ajax-Solr Framework for creating a search interface. The search
>> interface works well.
>> In my case, the results have document level security so by even indexing
>> records with there authorized users help me to filter results per user
>> based on the authentication of the user.
>> 
>> The problem that I have to a pass always a parameter to the SOLR Server
>> with userid={xyz} which one can figure out from the SOLR URL(ajax call url)
>> using Firebug tool in the Net Console on Firefox and can change this
>> parameter value to see others records which he/she is not authorized.
>> Basically it is Cross Site Scripting Issue.
>> 
>> I have read about some approaches for Solr Security like Nginx with Jetty
>> & .htaccess based security.Overall what i understand from this is that we
>> can restrict users to do update/delete operations on SOLR as well as we can
>> restrict the SOLR admin interface to certain IPs also. But How can I
>> restrict the {solr-server}/solr/select based results from access by
>> different user id's ?
>> 





Re: Question about cache

2012-05-11 Thread Michael Kuhlmann

Am 11.05.2012 15:48, schrieb Anderson vasconcelos:

Hi

Analysing the solr server in glassfish with Jconsole, the Heap Memory Usage
don't use more than 4 GB. But, when was executed the TOP comand, the free
memory in Operating system is only 200 MB. The physical memory is only 10GB.

Why machine used so much memory? The cache fields are included in Heap
Memory usage? The other 5,8 GB is the caching of Operating System for
recent open files? Exists some way to tunning this?

Thanks

If the OS is Linux or some other Unix variant, it keeps as much disk 
content in memory as possible. Whenever new memory is needed, it 
automatically gets freed. That won't need time, and there's no need to 
tune anything.


Don't look at the free memory in the top command, it's nearly useless. Have 
a look at how much memory your Glassfish process is consuming, and use 
the 'free' command (maybe together with the -m parameter for human 
readability) to find out more about your free memory. The "-/+ buffers/cache" 
line is the relevant one.

Greetings,
Kuli


How detect slave replication termination

2012-05-11 Thread Jamel ESSOUSSI
Hi,

I have an indexer that indexes Solr documents. At the end of the indexing I
initiate replication by activating it on the master and on all slaves. My
question is: how will I know when the replication between the master and
slave1 has ended, so that I can start replicating to slave2?

Best Regards

--Jamel ESSOUSSI


--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-detect-slave-replication-termination-tp3979991.html
Sent from the Solr - User mailing list archive at Nabble.com.
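
(For reference, the ReplicationHandler can report this; polling something like the following on the slave, assuming the handler is registered at /replication, returns an isReplicating flag among the details:

    http://slave1:8983/solr/replication?command=details

Once isReplicating is back to false and the slave's index version matches the master's, that slave is done.)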


Re: Slow indexing in solr 3.6

2012-05-11 Thread not interesting
Are you using DIH and CachedSqlEntityProcessor? I have a similar
issue; the 3.6.1 jars of DIH might help you, see:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg65912.html

Kellen


Re: SOLR Security

2012-05-11 Thread Jan Høydahl
Hi,

There is nothing stopping you from pointing Ajax-SOLR to a URL on your 
app-server, which acts as a security insulation layer between the Solr backend 
and the world. In this (thin) layer you can analyze the input and choose 
carefully what to let through and not.

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 11. mai 2012, at 06:37, Anupam Bhattacharya wrote:

> Yes, I agree with you.
> 
> But the Ajax-Solr framework doesn't fit that model. Any alternative
> solution?
> 
> Anupam
> 
> On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
> mklosterme...@riskexchange.com> wrote:
> 
>> Instead of hitting the Solr server directly from the client, I think I
>> would go through your application server, which would have access to all
>> the users data and can forward that to the Solr server, thereby hiding it
>> from the client.
>> 
>> Mike
>> 
>> 
>> -Original Message-
>> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
>> Sent: Thursday, May 10, 2012 9:53 PM
>> To: solr-user@lucene.apache.org
>> Subject: SOLR Security
>> 
>> I am using the Ajax-Solr framework to create a search interface. The
>> search interface works well.
>> In my case, the results have document-level security, so indexing
>> records with their authorized users lets me filter results per user
>> based on the user's authentication.
>> 
>> The problem is that I always have to pass a parameter userid={xyz} to
>> the SOLR server; anyone can figure it out from the SOLR URL (the Ajax
>> call URL) using Firebug's Net console in Firefox and change the
>> parameter value to see other users' records that he/she is not
>> authorized for. Basically it is a cross-site scripting issue.
>> 
>> I have read about some approaches to Solr security, like Nginx with
>> Jetty and .htaccess-based security. Overall, what I understand from
>> this is that we can restrict users from doing update/delete operations
>> on SOLR, and we can also restrict the SOLR admin interface to certain
>> IPs. But how can I restrict the {solr-server}/solr/select results from
>> being accessed by different user ids?
>> 



Re: Identify indexed terms of document

2012-05-11 Thread Anderson vasconcelos
Thanks

2012/5/11 Michael Kuhlmann 

> On 10.05.2012 22:27, Ahmet Arslan wrote:
>
>>> Is it possible to see what terms are indexed for a field of a
>>> document that is stored=false?
>>
>> One way is to use
>> http://wiki.apache.org/solr/LukeRequestHandler
>
> Another approach is this:
>
> - Query for exactly this document, e.g. by using the unique field
> - Add this to your URL parameters:
> &facet=true&facet.field=<your_field>&facet.mincount=1
>
> -Kuli
>


Re: Indexing data from pdf

2012-05-11 Thread Ahmet Arslan
> org.apache.solr.common.SolrException log
> SEVERE: Full Import failed:java.lang.RuntimeException:
> java.lang.RuntimeException:
> org.apache.solr.handler.dataimport.DataImportHandlerException:
> java.lang.NoClassDefFoundError:
> org/apache/tika/parser/AutoDetectParser


Did you put all of the jars found in solr/contrib/extraction/lib into the 
SolrHome/lib directory?



Slow indexing in solr 3.6

2012-05-11 Thread mechravi25
Hi,
 
I am migrating from solr 1.4 to solr 3.6. I have used the latest 3.6 jars.
 
After indexing some data, I noticed that the indexing is taking a lot of
time; the statistics are shown below:

  Total Requests made to DataSource=1737,
  Total Rows Fetched=1133174,
  Total Documents Skipped=0,
  Full Dump Started=2012-05-11 00:16:03,
  Indexing completed. Added/Updated: 434 documents. Deleted 0 documents.,
  Committed=2012-05-11 00:36:20,
  Total Documents Processed=434,
  Time taken=0:20:16.941
  (This response format is experimental. It is likely to change in the
  future.)

  whereas indexing the same data in 1.4 takes much less time, and the total
number of requests made to the datasource is also lower in 1.4 than in 3.6:
  
  Total Requests made to DataSource=5, 
  Total Rows Fetched=3042, 
  Total Documents Skipped=0, 
  Full Dump Started=2012-05-10 01:47:59, 
  Indexing completed. Added/Updated: 434 documents. Deleted 0 documents., 
  Committed=2012-05-10 01:48:34, Optimized=2012-05-10 01:48:34, 
  Total Documents Processed=434, Time taken =0:0:35.760
  
Also, during indexing, I'm getting the following message in the output log
file:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.

But I've placed the latest jar file on both the server and the client. Is
this issue related to the slow indexing?

Is indexing slower in 3.6 compared to 1.4, or am I missing anything here?

Please guide me.

Note: I am using the same Solr config and schema file for 3.6 as in 1.4.
 
Thanks in advance.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Slow-indexing-in-solr-3-6-tp3979903.html
Sent from the Solr - User mailing list archive at Nabble.com.


DataImportHandler - Custom EventListener

2012-05-11 Thread andre.schneider
Hi there,
I want to register a custom EventListener with the DataImportHandler, but I
get a NoClassDefFoundError.
My configuration: 
Gentoo Linux.
Solr home is /opt/solr.
The solr.war file is deployed in an existing Tomcat at /opt/tomcat/webapps.
The Solr version is 3.6, the Tomcat version is 6.0.35, Oracle JDK 1.7.0_03.

MyEventListener implements org.apache.solr.handler.dataimport.EventListener. 
It is in a jar file in /opt/solr/myproject/lib. (The mysql jdbc driver jar
is also there, and it can be loaded.)
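
For reference, such a listener only needs to implement the single
onEvent method. A minimal sketch of the shape of MyEventListener (the
log line is illustrative):

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.EventListener;

public class MyEventListener implements EventListener {
    @Override
    public void onEvent(Context ctx) {
        // Called when the registered import event (e.g. onImportEnd) fires.
        System.out.println("DIH import event fired");
    }
}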

In the solrconfig.xml I added the following lib tag:

<lib dir="/opt/solr/myproject/lib" />

and the following DataImportHandler:

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

The db-data-config.xml looks like:

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" ... />
  <document onImportEnd="...MyEventListener">
    <entity ... />
  </document>
</dataConfig>


But when I run the full-import command, Solr imports the documents; when it
then tries to load my EventListener, the following exception is thrown:
SEVERE: Full Import failed:java.lang.NoClassDefFoundError:
org/apache/solr/handler/dataimport/EventListener
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
at java.lang.ClassLoader.loadClass(ClassLoader.java:410)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:378)
at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:1100)
at
org.apache.solr.handler.dataimport.DocBuilder.invokeEventListener(DocBuilder.java:158)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:251)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.ClassNotFoundException:
org.apache.solr.handler.dataimport.EventListener
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 23 more

I have also tried putting the Solr jars in a shared lib folder of Tomcat,
but I get the same error.

Can anybody help?

Thanks a lot in advance.

Best regards,
Andre

--
View this message in context: 
http://lucene.472066.n3.nabble.com/DataImportHandler-Custom-EventListener-tp3979799.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Indexing data from database

2012-05-11 Thread anarchos78
Thank you thank you thank you!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-data-from-database-tp3979692p3979778.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Indexing data from database

2012-05-11 Thread Jack Krupansky
Maybe the id values overlap between your database tables. Solr needs unique 
values for the id, so if a document is indexed from a different database 
table but with the same id value, Solr will replace the existing document 
that has that id. You need to make sure that the id values are unique across 
all of your database tables, for example by adding a prefix or suffix that 
identifies the database table the data is coming from.


In other words, maybe all of your "journal" documents got overwritten with 
documents with the same ids from other tables.
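
With DIH this is typically done in the SQL itself (e.g. SELECT
CONCAT('journal-', id) AS id ... on MySQL) or with a TemplateTransformer.
Purely to illustrate the idea, a SolrJ sketch with made-up values:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class PrefixedIdExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrInputDocument doc = new SolrInputDocument();
        // The "journal-" prefix keeps id 1 from the journals table from
        // colliding with id 1 from books, members, and so on.
        doc.addField("id", "journal-1");
        doc.addField("model", "journal");
        doc.addField("biog", "some text");
        server.add(doc);
        server.commit();
    }
}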


-- Jack Krupansky

-Original Message- 
From: anarchos78

Sent: Friday, May 11, 2012 7:47 AM
To: solr-user@lucene.apache.org
Subject: Indexing data from database

Hello friends,
I am trying to index data from a database, and that part works. But I
have a problem. I want to use one index for the whole database. All the db
tables have at least 3 columns with the same name (I want it to be like
this). For instance I have these tables: members, new_members, books,
journals and cds. All of these have columns named id, model and biog. So in
all the db tables the id (auto-incremented) starts from 1.
When I query Solr using a filter (fq=model:journal) it returns nothing.
Querying for books returns only a portion of the data (I have 5 rows and it
returns 2; I am using *:* in order to retrieve all the rows). I know that
the data is in Solr's "data" directory. I think there is a conflict of some
kind. How can I have a single index with all these tables without any
conflicts?


*The data-config.xml:*

[XML configuration stripped by the mail archive]

*The schema.xml (fields):*

[field definitions stripped by the mail archive]

<uniqueKey>id</uniqueKey>
<defaultSearchField>biog</defaultSearchField>


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-data-from-database-tp3979692.html
Sent from the Solr - User mailing list archive at Nabble.com. 



With solr.MappingCharFilterFactory, highlighting doesn't work with transformed characters

2012-05-11 Thread remus

Hi,

In my schema.xml I have, for my text field type:

<charFilter class="solr.MappingCharFilterFactory"
            mapping="mapping-ISOLatin1Accent.txt"/>


(See below for the complete fieldType definition.) This correctly transforms 
all accented characters, umlauts, etc. to their "normal" form.
The problem is this: when I search for any word with such a character 
(e.g. "Ärzte", which becomes "Arzte" internally), highlighting doesn't 
work; no strings are returned. No error message is issued and no 
exceptions occur, as far as I can tell.
If I search e.g. for ?rzte (without quotes), highlighting works fine 
again when "Ärzte" is found. If I comment out the 
solr.MappingCharFilterFactory in the text type, highlighting also works 
perfectly.


The problem exists in all versions I tested, i.e., 1.4, 3.5, 3.6.

Google didn't find anything useful. Does anyone have any clues or 
suggestions here? Any help would be much appreciated!


Cheers,
remus

---
Complete fieldType definition:

<fieldType name="text" class="solr.TextField" stored="true"
           multiValued="true" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    [tokenizer and filters stripped by the mail archive]
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    [tokenizer and filters stripped by the mail archive]
  </analyzer>
</fieldType>



Indexing data from database

2012-05-11 Thread anarchos78
Hello friends,
I am trying to index data from a database, and that part works. But I
have a problem. I want to use one index for the whole database. All the db
tables have at least 3 columns with the same name (I want it to be like
this). For instance I have these tables: members, new_members, books,
journals and cds. All of these have columns named id, model and biog. So in
all the db tables the id (auto-incremented) starts from 1.
When I query Solr using a filter (fq=model:journal) it returns nothing.
Querying for books returns only a portion of the data (I have 5 rows and it
returns 2; I am using *:* in order to retrieve all the rows). I know that
the data is in Solr's "data" directory. I think there is a conflict of some
kind. How can I have a single index with all these tables without any
conflicts?


*The data-config.xml:*

[XML configuration stripped by the mail archive]

*The schema.xml (fields):*

[field definitions stripped by the mail archive]

<uniqueKey>id</uniqueKey>
<defaultSearchField>biog</defaultSearchField>


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-data-from-database-tp3979692.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Suddenly OOM

2012-05-11 Thread Jasper Floor
Our ramBufferSizeMB is the default. The Xmx is 75% of the available memory
on the machine, which is 4GB. We've tried increasing it to 85% and even
gave the machine 10GB of memory, so we more than doubled the memory.
The amount of data hadn't doubled, but where the memory used to be enough,
it now never seems to be enough.

mvg,
Jasper

On Thu, May 10, 2012 at 6:03 PM, Otis Gospodnetic
 wrote:
> Jasper,
>
> The simple answer is to increase -Xmx :)
> What is your ramBufferSizeMB (solrconfig.xml) set to?  Default is 32 (MB).
>
> That autocommit you mentioned is a DB commit?  Not Solr one, right?  If so, 
> why is commit needed when you *read* data from DB?
>
> Otis
> 
> Performance Monitoring for Solr / ElasticSearch / HBase - 
> http://sematext.com/spm
>
>
>
> - Original Message -
>> From: Jasper Floor 
>> To: solr-user@lucene.apache.org
>> Cc:
>> Sent: Thursday, May 10, 2012 9:06 AM
>> Subject: Suddenly OOM
>>
>> Hi all,
>>
>> we've been running Solr 1.4 for about a year with no real problems. As
>> of Monday it became impossible to do a full import on our master
>> because of an OOM. Now what I think is strange is that even after we
>> more than doubled the available memory there would still always be an
>> OOM. We seem to have reached a magic number of documents beyond which
>> Solr requires infinite memory (or at least more than 2.5x what it
>> previously needed, which is the same as infinite unless we invest in
>> more resources).
>>
>> We have solved the immediate problem by changing autocommit=false,
>> holdability="CLOSE_CURSORS_AT_COMMIT", batchSize=1. Now
>> holdability in this case I don't think does very much as I believe
>> this is the default behavior. BatchSize certainly has a direct effect
>> on performance (about 3x time difference between 1 and 1). The
>> autocommit is a problem for us however. This leaves transactions
>> active in the db which may block other processes.
>>
>> We have about 5.1 million documents in the index which is about 2.2 
>> gigabytes.
>>
>> A full index is a rare operation with us but when we need it we also
>> need it to work (thank you captain obvious).
>>
>> With the settings above a full index takes 15 minutes. We anticipate
>> we will be handling at least 10x the amount of data in the future. I
>> actually hope to have solr 4 by then but I can't sell a product which
>> isn't finalized yet here.
>>
>>
>> Thanks for any insight you can give.
>>
>> mvg,
>> Jasper
>>


Re: slave index not cleaned

2012-05-11 Thread Jasper Floor
Hi,

On Thu, May 10, 2012 at 5:59 PM, Otis Gospodnetic
 wrote:
> Hi Jasper,

Sorry, I should've added more technical info without being prompted.

> Solr does handle that for you.  Some more stuff to share:
>
> * Solr version?

1.4

> * JVM version?
1.7 update 2

> * OS?
Debian (2.6.32-5-xen-amd64)

> * Java replication?
yes

> * Errors in Solr logs?
no

> * deletion policy section in solrconfig.xml?
Missing, I would say; but I don't see this on the replication wiki page.

This is what we have configured for replication:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">${solr.master.url}/df-stream-store/replication</str>
    <str name="pollInterval">00:20:00</str>
    <str name="compression">internal</str>
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">1</str>
  </lst>
</requestHandler>


We will be updating to 3.6 fairly soon, however. To be honest, from
what I've read, SolrCloud is what we really want in the future, but we
will have to be patient for that.

thanks in advance

mvg,
Jasper

> You may also want to look at your Index report in SPM 
> (http://sematext.com/spm) before/during/after replication and share what you 
> see.
>
> Otis
> 
> Performance Monitoring for Solr / ElasticSearch / HBase - 
> http://sematext.com/spm
>
>
>
> - Original Message -
>> From: Jasper Floor 
>> To: solr-user@lucene.apache.org
>> Cc:
>> Sent: Thursday, May 10, 2012 9:08 AM
>> Subject: slave index not cleaned
>>
>> Perhaps I am missing the obvious but our slaves tend to run out of
>> disk space. The index sizes grow to multiple times the size of the
>> master. So I just toss all the data and trigger a replication.
>> However, can't solr handle this for me?
>>
>> I'm sorry if I've missed a simple setting which does this for me, but
>> if its there then I have missed it.
>>
>> mvg
>> Jasper
>>


Merging two DocSets in solr

2012-05-11 Thread Ramprakash Ramamoorthy
Dear all,

  I get two different DocSets from two different searchers. I need
to merge them into one and get the facet counts from the merged
DocSet. How do I do it? Any pointers would be appreciated.
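
Note that this only makes sense if both DocSets were produced against
the same SolrIndexSearcher; internal Lucene document ids are not
comparable across different searchers. Under that assumption, a sketch
against the Solr 3.x internal API (treat the exact SimpleFacets
constructor as version-dependent):

import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SimpleFacets;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.DocSet;

public class MergedFacets {
    // Both DocSets must come from the same searcher for the union to be valid.
    public static NamedList<Object> facetsForUnion(SolrQueryRequest req,
            SolrParams params, DocSet a, DocSet b) throws Exception {
        DocSet merged = a.union(b);  // logical OR of the two result sets
        SimpleFacets facets = new SimpleFacets(req, merged, params);
        return facets.getFacetCounts();
    }
}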

-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
Project Trainee,
Zoho Corporation.
+91 9626975420


Lucene FieldCache doesn' get cleaned up and OOM occurs

2012-05-11 Thread Mathias Hodler
Hi,

sorting on a field grows the Lucene FieldCache. If I start 10
queries, each sorting on a different field, 9 queries can be executed,
but then the Lucene FieldCache exceeds the maximum memory and an OOM
occurs.
In my opinion the Lucene FieldCache should be cleaned up if there is
not enough memory left. But instead of that, the FieldCache always
remains in the old generation of the GC.

Could this be fixed, or is the only way out to get more memory?
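
For reference: FieldCache entries are keyed by reader and field and are
only released when the reader is closed, not under memory pressure, so
the usual remedies are more heap or fewer distinct sort fields. Dumping
the cache contents shows which fields are pinned; a sketch against the
Lucene 2.9+/3.x API:

import org.apache.lucene.search.FieldCache;

public class FieldCacheReport {
    public static void main(String[] args) {
        // Lists every (reader, field, type) entry currently held. Each field
        // you sort on contributes an entry that lives until its reader closes.
        for (FieldCache.CacheEntry entry : FieldCache.DEFAULT.getCacheEntries()) {
            System.out.println(entry.getFieldName() + " / " + entry.getCacheType());
        }
    }
}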

Thanks.

Mathias


Re: Fwd: Delete documents

2012-05-11 Thread Tolga

That worked, thanks a lot Jack :)

On 5/11/12 7:44 AM, Jack Krupansky wrote:
Try using the actual id of the document rather than the shell 
substitution variable - if you're trying to delete one document.


To delete all documents, use delete by query:

<delete><query>*:*</query></delete>

See:
http://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F 
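
For reference, the same thing from SolrJ (a sketch using the 1.4/3.x-era
CommonsHttpSolrServer; adjust the URL to your instance):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class DeleteAll {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        server.deleteByQuery("*:*");  // match and delete every document
        server.commit();              // make the deletes visible to searchers
    }
}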



-- Jack Krupansky

-Original Message- From: Tolga
Sent: Friday, May 11, 2012 12:31 AM
To: solr-user@lucene.apache.org
Subject: Fwd: Delete documents

Anyone at all?

 Original Message 
Subject: Delete documents
Date: Thu, 10 May 2012 22:59:49 +0300
From: Tolga 
To: solr-user@lucene.apache.org



Hi,
I've been reading
http://lucene.apache.org/solr/api/doc-files/tutorial.html and, in the
section "Deleting Data", I've edited schema.xml to include a field named
id and issued the command

for f in *; do java -Ddata=args -Dcommit=yes -jar post.jar "<delete><id>$f</id></delete>"; done

then went to the stats page, only to find that no files were de-indexed.
How can I do that?

Regards,



Re: Identify indexed terms of document

2012-05-11 Thread Michael Kuhlmann

On 10.05.2012 22:27, Ahmet Arslan wrote:




Is it possible to see what terms are indexed for a field of a
document that is stored=false?


One way is to use http://wiki.apache.org/solr/LukeRequestHandler


Another approach is this:

- Query for exactly this document, e.g. by using the unique field
- Add this to your URL parameters:
&facet=true&facet.field=<your_field>&facet.mincount=1

-Kuli
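
In SolrJ terms, the second approach looks roughly like this (a sketch;
the id value, field name and URL are placeholders):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TermsOfOneDoc {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("id:42");  // restrict to the one document
        q.setFacet(true);
        q.addFacetField("myIndexedField");     // the stored=false field
        q.setFacetMinCount(1);                 // only terms this document contains
        QueryResponse rsp = server.query(q);
        FacetField ff = rsp.getFacetField("myIndexedField");
        for (FacetField.Count c : ff.getValues()) {
            System.out.println(c.getName() + " (" + c.getCount() + ")");
        }
    }
}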