Re: SOLR 4.2 SolrQuery exception

2013-03-25 Thread Gopal Patwa
Manually delete the lock file
/data/solr1/example/solr/collection1/./data/index/write.lock,
and restart Solr.


On Sun, Mar 24, 2013 at 9:32 PM, Sandeep Kumar Anumalla 
sanuma...@etisalat.ae wrote:

 Hi,

 Hi,

 I managed to resolve this issue and I am getting the results also. But
 this time I am getting a different exception while loading the Solr container.

 Here is the Code.

 String SOLR_HOME = "/data/solr1/example/solr/collection1";
 CoreContainer coreContainer = new CoreContainer(SOLR_HOME);
 CoreDescriptor descriptor = new CoreDescriptor(coreContainer,
 "collection1", new File(SOLR_HOME).getAbsolutePath());
 SolrCore solrCore = coreContainer.create(descriptor);
 coreContainer.register(solrCore, false);
 File home = new File(SOLR_HOME);
 File f = new File(home, "solr.xml");
 coreContainer.load(SOLR_HOME, f);
 server = new EmbeddedSolrServer(coreContainer, "collection1");
 SolrQuery q = new SolrQuery();
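
 One likely cause of the write.lock error below: this snippet opens the core
 twice, once via create()/register() and again via load(), so two
 IndexWriters contend for the same index lock. A minimal sketch of a
 single-initialization path, assuming the same Solr 4.2 CoreContainer API
 used above:

 CoreContainer container = new CoreContainer(SOLR_HOME);
 container.load(SOLR_HOME, new File(SOLR_HOME, "solr.xml"));
 SolrServer server = new EmbeddedSolrServer(container, "collection1");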


 Parameters inside solrconfig.xml:
 <!-- <writeLockTimeout>1000</writeLockTimeout> -->
 <lockType>simple</lockType>
 <unlockOnStartup>true</unlockOnStartup>


 WARNING: Unable to get IndexCommit on startup
 org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
 SimpleFSLock@/data/solr1/example/solr/collection1/./data/index/write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:84)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:636)
at
 org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
at
 org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
at
 org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:192)
at
 org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:106)
at
 org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:904)
at
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:592)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:801)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:619)
at
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1021)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:634)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)



 From: Sandeep Kumar Anumalla
 Sent: 24 March, 2013 03:44 PM
 To: solr-user@lucene.apache.org
 Subject: SOLR 4.2 SolrQuery exception

 I am using the below code and getting the following exception while using SolrQuery:



 Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener
 newSearcher
 INFO: QuerySenderListener sending requests to 
 Searcher@795e0c2b main{StandardDirectoryReader(segments_49:524 _4v(4.2):C299313
 _4x(4.2):C2953/1396 _4y(4.2):C2866/1470 _4z(4.2):C4263/2793
 _50(4.2):C3554/761 _51(4.2):C1126/365 _52(4.2):C650/285 _53(4.2):C500/215
 _54(4.2):C1808/1593 _55(4.2):C1593)}
 Mar 24, 2013 3:08:07 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.NullPointerException
 at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
 at
 org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
 at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1586)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:679)

 Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore execute
 INFO: [collection1] webapp=null path=null
 params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false}
 status=500 QTime=4
 Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener
 newSearcher
 INFO: QuerySenderListener done.
 Mar 24, 2013 3:08:07 PM
 org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener
 newSearcher
 INFO: Loading spell 

Re: OutOfMemoryError

2013-03-25 Thread Arkadi Colson
I changed my system memory to 12GB. Solr now gets -Xms2048m -Xmx8192m as 
parameters. I also added -XX:+UseG1GC to the java process. But now the 
whole machine crashes! Any idea why?


Mar 22 20:30:01 solr01-gs kernel: [716098.077809] java invoked 
oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
Mar 22 20:30:01 solr01-gs kernel: [716098.077962] java cpuset=/ 
mems_allowed=0
Mar 22 20:30:01 solr01-gs kernel: [716098.078019] Pid: 29339, comm: java 
Not tainted 2.6.32-5-amd64 #1

Mar 22 20:30:01 solr01-gs kernel: [716098.078095] Call Trace:
Mar 22 20:30:01 solr01-gs kernel: [716098.078155] [810b6324] ? 
oom_kill_process+0x7f/0x23f
Mar 22 20:30:01 solr01-gs kernel: [716098.078233] [810b6848] ? 
__out_of_memory+0x12a/0x141
Mar 22 20:30:01 solr01-gs kernel: [716098.078309] [810b699f] ? 
out_of_memory+0x140/0x172
Mar 22 20:30:01 solr01-gs kernel: [716098.078385] [810ba704] ? 
__alloc_pages_nodemask+0x4ec/0x5fc
Mar 22 20:30:01 solr01-gs kernel: [716098.078469] [812fb47a] ? 
io_schedule+0x93/0xb7
Mar 22 20:30:01 solr01-gs kernel: [716098.078541] [810bbc69] ? 
__do_page_cache_readahead+0x9b/0x1b4
Mar 22 20:30:01 solr01-gs kernel: [716098.078626] [81064fc0] ? 
wake_bit_function+0x0/0x23
Mar 22 20:30:01 solr01-gs kernel: [716098.078702] [810bbd9e] ? 
ra_submit+0x1c/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.078773] [810b4a72] ? 
filemap_fault+0x17d/0x2f6
Mar 22 20:30:01 solr01-gs kernel: [716098.078849] [810ca9e2] ? 
__do_fault+0x54/0x3c3
Mar 22 20:30:01 solr01-gs kernel: [716098.078921] [810ccd36] ? 
handle_mm_fault+0x3b8/0x80f
Mar 22 20:30:01 solr01-gs kernel: [716098.078999] [8101166e] ? 
apic_timer_interrupt+0xe/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.079078] [812febf6] ? 
do_page_fault+0x2e0/0x2fc
Mar 22 20:30:01 solr01-gs kernel: [716098.079153] [812fca95] ? 
page_fault+0x25/0x30

Mar 22 20:30:01 solr01-gs kernel: [716098.079222] Mem-Info:
Mar 22 20:30:01 solr01-gs kernel: [716098.079261] Node 0 DMA per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079310] CPU0: hi: 0, 
btch:   1 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079374] CPU1: hi: 0, 
btch:   1 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079439] CPU2: hi: 0, 
btch:   1 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079527] CPU3: hi: 0, 
btch:   1 usd:   0

Mar 22 20:30:01 solr01-gs kernel: [716098.079591] Node 0 DMA32 per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079642] CPU0: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079706] CPU1: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079770] CPU2: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079834] CPU3: hi: 186, 
btch:  31 usd:   0

Mar 22 20:30:01 solr01-gs kernel: [716098.079899] Node 0 Normal per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079951] CPU0: hi: 186, 
btch:  31 usd:  17
Mar 22 20:30:01 solr01-gs kernel: [716098.080015] CPU1: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080079] CPU2: hi: 186, 
btch:  31 usd:   2
Mar 22 20:30:01 solr01-gs kernel: [716098.080142] CPU3: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209] active_anon:2638016 
inactive_anon:388557 isolated_anon:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209]  active_file:68 
inactive_file:236 isolated_file:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080210]  unevictable:0 dirty:5 
writeback:5 unstable:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080211]  free:16573 
slab_reclaimable:2398 slab_unreclaimable:2335
Mar 22 20:30:01 solr01-gs kernel: [716098.080212]  mapped:36 shmem:0 
pagetables:24750 bounce:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080575] Node 0 DMA 
free:15796kB min:16kB low:20kB high:24kB active_anon:0kB 
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:15244kB mlocked:0kB 
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB 
slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB 
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 22 20:30:01 solr01-gs kernel: [716098.081041] lowmem_reserve[]: 0 
3000 12090 12090
Mar 22 20:30:01 solr01-gs kernel: [716098.081110] Node 0 DMA32 
free:39824kB min:3488kB low:4360kB high:5232kB active_anon:2285240kB 
inactive_anon:520624kB active_file:0kB inactive_file:188kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB 
mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:4152kB slab_unreclaimable:1640kB kernel_stack:1104kB 
pagetables:31100kB unstable:0kB bounce:0kB writeback_tmp:0kB 
pages_scanned:89 all_unreclaimable? no
Mar 22 20:30:01 solr01-gs kernel: [716098.081600] lowmem_reserve[]: 0 0 
9090 9090
Mar 22 20:30:01 solr01-gs kernel: [716098.081664] Node 0 Normal 
free:10672kB min:10572kB 

SOLR - Unable to execute query error - DIH

2013-03-25 Thread kobe.free.wo...@gmail.com
Hello All,

I am trying to index data from a SQL Server view into Solr using the DIH
with the full-import command. The view has 750K rows and 427 columns. During the
first execution I indexed only the first 50 rows of the view; the data got
indexed in 10 min. But when I executed the same scenario to index the
complete set of 750K rows, the execution continued for 2 days and then
rolled back, giving me the following error:

Unable to execute the query: select * from.

Following is my DIH configuration file,

<dataConfig>
  <dataSource type="JdbcDataSource"
    driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
    url="jdbc:sqlserver://server1\sql2012;databaseName=DBName" user="x"
    password="x" />
  <document name="Search" batchsize="1">
    <entity name="Search" query="select top 500 * from view">
      <field column="ID" name="Id" />

As suggested in some of the posts, I did try with batchsize=-1, but it didn't
work out. Please suggest whether this is the correct approach or whether any
parameter needs to be modified for tuning.

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Unable-to-execute-query-error-DIH-tp4051028.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR - Unable to execute query error - DIH

2013-03-25 Thread kobe.free.wo...@gmail.com
In the context of the above scenario, when I try to index a set of 500 rows,
it fetches and indexes around 400-odd rows and then shows no progress, just
keeps on executing. What could be the cause of this issue? If you have run
into such a scenario, please share the details.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Unable-to-execute-query-error-DIH-tp4051028p4051034.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [ANNOUNCE] Solr wiki editing change

2013-03-25 Thread Andrzej Bialecki

On 3/25/13 4:18 AM, Steve Rowe wrote:

The wiki at http://wiki.apache.org/solr/ has come under attack by spammers more 
frequently of late, so the PMC has decided to lock it down in an attempt to 
reduce the work involved in tracking and removing spam.

 From now on, only people who appear on 
http://wiki.apache.org/solr/ContributorsGroup will be able to 
create/modify/delete wiki pages.

Please request either on the solr-user@lucene.apache.org or on 
d...@lucene.apache.org to have your wiki username added to the 
ContributorsGroup page - this is a one-time step.


Please add AndrzejBialecki to this group. Thank you!

--
Best regards,
Andrzej Bialecki
http://www.sigram.com, blog http://www.sigram.com/blog
 ___.,___,___,___,_._. __
[___||.__|__/|__||\/|: Information Retrieval, System Integration
___|||__||..\|..||..|: Contact: info at sigram dot com



storing key value pair in multivalued field solr4.0

2013-03-25 Thread Karunakar Reddy
Hi,
I am using Solr 4.0. I want to store key-value pairs of attributes in a
multivalued field.

For example, I have some documents (products) which have attributes as one
field, and I indexed the attributes as separate documents to power
auto-suggest. Now in some auto-suggest cases I also have to show the facet
count of products. For this I am using Solr 4.0 joins and faceting on
attributes. Here I want to get the name and id of the attributes. How can I
achieve this?

The query looks like this:

localhost:8980/solr/searchapp/select?q=%7B!join+from=attr_id+to=prod_attr_id%7Dterms:red&wt=json&indent=true&facet.field=prod_attr_id&facet=true&rows=1000&fl=product_name,product_id


Thanks in advance !


Re: Very slow query when boosting involves ExternalFileField

2013-03-25 Thread Mikhail Khludnev
Floyd,

I think you need to provide a stack trace or sampling output.


On Fri, Mar 22, 2013 at 6:23 AM, Floyd Wu floyd...@gmail.com wrote:

 Anybody can point me a direction?
 Many thanks.



 2013/3/20 Floyd Wu floyd...@gmail.com

  Hi everyone,
 
  I have a problem and have had no luck figuring it out.
 
  When I issue Query 1:
 
  http://localhost:8983/solr/select?q={!boost+b=recip(ms(NOW/HOUR,last_modified_datetime),3.16e-11,1,1)}all:java&start=0&rows=10&fl=score,author&sort=score+desc
 
  versus Query 2:
 
  http://localhost:8983/solr/select?q={!boost+b=sum(ranking,recip(ms(NOW/HOUR,last_modified_datetime)),3.16e-11,1,1)}all:java&start=0&rows=10&fl=score,author&sort=score+desc
 
  The difference between the two queries is the boost.
  The boost function of Query 2 uses a field named ranking, and this field
  is an ExternalFileField.
  The external file contains key=value pairs, about 1 lines.
 
  Execution time:
  Query 1 -- 100ms
  Query 2 -- 2300ms
 
  I tried to issue Query 3, changing ranking to a constant 1:
 
  http://localhost:8983/solr/select?q={!boost+b=sum(1,recip(ms(NOW/HOUR,last_modified_datetime)),3.16e-11,1,1)}all:java&start=0&rows=10&fl=score,author&sort=score+desc
 
  Execution time:
  Query 3 -- 110ms
 
  One thing I am sure of is that anything involving an ExternalFileField will
  slow down query execution time significantly. But I have no idea how to
  solve this problem, as my boost function must use the value of the ranking
  field.
 
  Please help on this.
 
  PS: I'm using SOLR-4.1
 
  Floyd
 
 
 
 




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Undefined field problem.

2013-03-25 Thread Mid Night
Hi,


I recently added a new field (toptipp) to an existing solr schema.xml and
it worked just fine.  Subsequently I added two more fields (active_cruises
and non_grata) to the schema, and now I get this error:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int
name="QTime">6</int></lst><lst name="error"><str name="msg">undefined
field: active_cruise</str><int name="code">400</int></lst>
</response>


My Solr db is populated via a program that creates and uploads a CSV file.
When I view the CSV file, the field active_cruises (given as undefined
above) is populated correctly.  As far as I can tell, when I added the
final fields to the schema, I did exactly the same as when I added
toptipp.  I updated schema.xml and restarted Solr (java -jar start.jar).

I am really at a loss here.  Can someone please help with the answer or by
pointing me in the right direction?  Naturally I'd be happy to provide
further info if needed.


Thanks
MK


Re: Undefined field problem.

2013-03-25 Thread Mid Night
Further to the prev msg:  Here's an extract from my current schema.xml:

   <field name="show_en" type="boolean" indexed="true" stored="false"
required="true" />
   <field name="active_cruise" type="boolean" indexed="true" stored="true"/>
   <field name="non_grata" type="boolean" indexed="true" stored="true"/>
   <field name="toptipp" type="int" indexed="true" stored="true"/>



The original schema.xml had the last 3 fields in the order toptipp,
active_cruise and non_grata.  active_cruise and non_grata were also defined
as type="int".  I changed the order and field types in my attempts to fix
the error.





On 25 March 2013 11:21, Mid Night mid...@gmail.com wrote:

 Hi,


 I recently added a new field (toptipp) to an existing solr schema.xml and
 it worked just fine.  Subsequently I added two more fields (active_cruises
 and non_grata) to the schema, and now I get this error:

 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">400</int><int
 name="QTime">6</int></lst><lst name="error"><str name="msg">undefined field:
 active_cruise</str><int name="code">400</int></lst>
 </response>


 My Solr db is populated via a program that creates and uploads a CSV
 file.  When I view the CSV file, the field active_cruises (given as
 undefined above) is populated correctly.  As far as I can tell, when I
 added the final fields to the schema, I did exactly the same as when I
 added toptipp.  I updated schema.xml and restarted Solr (java -jar
 start.jar).

 I am really at a loss here.  Can someone please help with the answer or by
 pointing me in the right direction?  Naturally I'd be happy to provide
 further info if needed.


 Thanks
 MK










Re: OutOfMemoryError

2013-03-25 Thread Arkadi Colson
Is somebody using the UseG1GC garbage collector with Solr and Tomcat 7?
Any extra options needed?


Thanks...

On 03/25/2013 08:34 AM, Arkadi Colson wrote:
I changed my system memory to 12GB. Solr now gets -Xms2048m -Xmx8192m 
as parameters. I also added -XX:+UseG1GC to the java process. But now 
the whole machine crashes! Any idea why?


Mar 22 20:30:01 solr01-gs kernel: [716098.077809] java invoked 
oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
Mar 22 20:30:01 solr01-gs kernel: [716098.077962] java cpuset=/ 
mems_allowed=0
Mar 22 20:30:01 solr01-gs kernel: [716098.078019] Pid: 29339, comm: 
java Not tainted 2.6.32-5-amd64 #1

Mar 22 20:30:01 solr01-gs kernel: [716098.078095] Call Trace:
Mar 22 20:30:01 solr01-gs kernel: [716098.078155] [810b6324] 
? oom_kill_process+0x7f/0x23f
Mar 22 20:30:01 solr01-gs kernel: [716098.078233] [810b6848] 
? __out_of_memory+0x12a/0x141
Mar 22 20:30:01 solr01-gs kernel: [716098.078309] [810b699f] 
? out_of_memory+0x140/0x172
Mar 22 20:30:01 solr01-gs kernel: [716098.078385] [810ba704] 
? __alloc_pages_nodemask+0x4ec/0x5fc
Mar 22 20:30:01 solr01-gs kernel: [716098.078469] [812fb47a] 
? io_schedule+0x93/0xb7
Mar 22 20:30:01 solr01-gs kernel: [716098.078541] [810bbc69] 
? __do_page_cache_readahead+0x9b/0x1b4
Mar 22 20:30:01 solr01-gs kernel: [716098.078626] [81064fc0] 
? wake_bit_function+0x0/0x23
Mar 22 20:30:01 solr01-gs kernel: [716098.078702] [810bbd9e] 
? ra_submit+0x1c/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.078773] [810b4a72] 
? filemap_fault+0x17d/0x2f6
Mar 22 20:30:01 solr01-gs kernel: [716098.078849] [810ca9e2] 
? __do_fault+0x54/0x3c3
Mar 22 20:30:01 solr01-gs kernel: [716098.078921] [810ccd36] 
? handle_mm_fault+0x3b8/0x80f
Mar 22 20:30:01 solr01-gs kernel: [716098.078999] [8101166e] 
? apic_timer_interrupt+0xe/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.079078] [812febf6] 
? do_page_fault+0x2e0/0x2fc
Mar 22 20:30:01 solr01-gs kernel: [716098.079153] [812fca95] 
? page_fault+0x25/0x30

Mar 22 20:30:01 solr01-gs kernel: [716098.079222] Mem-Info:
Mar 22 20:30:01 solr01-gs kernel: [716098.079261] Node 0 DMA per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079310] CPU0: hi: 0, 
btch:   1 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079374] CPU1: hi: 0, 
btch:   1 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079439] CPU2: hi: 0, 
btch:   1 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079527] CPU3: hi: 0, 
btch:   1 usd:   0

Mar 22 20:30:01 solr01-gs kernel: [716098.079591] Node 0 DMA32 per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079642] CPU0: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079706] CPU1: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079770] CPU2: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079834] CPU3: hi: 186, 
btch:  31 usd:   0

Mar 22 20:30:01 solr01-gs kernel: [716098.079899] Node 0 Normal per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079951] CPU0: hi: 186, 
btch:  31 usd:  17
Mar 22 20:30:01 solr01-gs kernel: [716098.080015] CPU1: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080079] CPU2: hi: 186, 
btch:  31 usd:   2
Mar 22 20:30:01 solr01-gs kernel: [716098.080142] CPU3: hi: 186, 
btch:  31 usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209] active_anon:2638016 
inactive_anon:388557 isolated_anon:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209]  active_file:68 
inactive_file:236 isolated_file:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080210]  unevictable:0 
dirty:5 writeback:5 unstable:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080211]  free:16573 
slab_reclaimable:2398 slab_unreclaimable:2335
Mar 22 20:30:01 solr01-gs kernel: [716098.080212]  mapped:36 shmem:0 
pagetables:24750 bounce:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080575] Node 0 DMA 
free:15796kB min:16kB low:20kB high:24kB active_anon:0kB 
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:15244kB mlocked:0kB 
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB 
slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB 
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 22 20:30:01 solr01-gs kernel: [716098.081041] lowmem_reserve[]: 0 
3000 12090 12090
Mar 22 20:30:01 solr01-gs kernel: [716098.081110] Node 0 DMA32 
free:39824kB min:3488kB low:4360kB high:5232kB active_anon:2285240kB 
inactive_anon:520624kB active_file:0kB inactive_file:188kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:3072096kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB 
shmem:0kB slab_reclaimable:4152kB slab_unreclaimable:1640kB 
kernel_stack:1104kB pagetables:31100kB unstable:0kB bounce:0kB 
writeback_tmp:0kB pages_scanned:89 all_unreclaimable? no
Mar 22 20:30:01 

Re: OutOfMemoryError

2013-03-25 Thread Bernd Fehling
The use of UseG1GC, yes,
but with Solr 4.x, Jetty 8.1.8 and Java HotSpot(TM) 64-Bit Server VM (1.7.0_07).
os.​arch: amd64
os.​name: Linux
os.​version: 2.6.32.13-0.5-xen

Only args are -XX:+UseG1GC -Xms16g -Xmx16g.
Monitoring shows that 16g is a bit high, I might reduce it to 10g or 12g for 
the slaves.
Start is at 5g, runtime is between 6 and 8g with some peaks to 9.5g.
Single index, 130 GByte, 43.5 million documents.

Regards,
Bernd


On 25.03.2013 11:55, Arkadi Colson wrote:
 Is somebody using the UseG1GC garbage collector with Solr and Tomcat 7? Any
 extra options needed?
 
 Thanks...
 
 On 03/25/2013 08:34 AM, Arkadi Colson wrote:
 I changed my system memory to 12GB. Solr now gets -Xms2048m -Xmx8192m as 
 parameters. I also added -XX:+UseG1GC to the java process. But now
 the whole machine crashes! Any idea why?

 Mar 22 20:30:01 solr01-gs kernel: [716098.077809] java invoked oom-killer: 
 gfp_mask=0x201da, order=0, oom_adj=0
 Mar 22 20:30:01 solr01-gs kernel: [716098.077962] java cpuset=/ 
 mems_allowed=0
 Mar 22 20:30:01 solr01-gs kernel: [716098.078019] Pid: 29339, comm: java Not 
 tainted 2.6.32-5-amd64 #1
 Mar 22 20:30:01 solr01-gs kernel: [716098.078095] Call Trace:
 Mar 22 20:30:01 solr01-gs kernel: [716098.078155] [810b6324] ? 
 oom_kill_process+0x7f/0x23f
 Mar 22 20:30:01 solr01-gs kernel: [716098.078233] [810b6848] ? 
 __out_of_memory+0x12a/0x141
 Mar 22 20:30:01 solr01-gs kernel: [716098.078309] [810b699f] ? 
 out_of_memory+0x140/0x172
 Mar 22 20:30:01 solr01-gs kernel: [716098.078385] [810ba704] ? 
 __alloc_pages_nodemask+0x4ec/0x5fc
 Mar 22 20:30:01 solr01-gs kernel: [716098.078469] [812fb47a] ? 
 io_schedule+0x93/0xb7
 Mar 22 20:30:01 solr01-gs kernel: [716098.078541] [810bbc69] ? 
 __do_page_cache_readahead+0x9b/0x1b4
 Mar 22 20:30:01 solr01-gs kernel: [716098.078626] [81064fc0] ? 
 wake_bit_function+0x0/0x23
 Mar 22 20:30:01 solr01-gs kernel: [716098.078702] [810bbd9e] ? 
 ra_submit+0x1c/0x20
 Mar 22 20:30:01 solr01-gs kernel: [716098.078773] [810b4a72] ? 
 filemap_fault+0x17d/0x2f6
 Mar 22 20:30:01 solr01-gs kernel: [716098.078849] [810ca9e2] ? 
 __do_fault+0x54/0x3c3
 Mar 22 20:30:01 solr01-gs kernel: [716098.078921] [810ccd36] ? 
 handle_mm_fault+0x3b8/0x80f
 Mar 22 20:30:01 solr01-gs kernel: [716098.078999] [8101166e] ? 
 apic_timer_interrupt+0xe/0x20
 Mar 22 20:30:01 solr01-gs kernel: [716098.079078] [812febf6] ? 
 do_page_fault+0x2e0/0x2fc
 Mar 22 20:30:01 solr01-gs kernel: [716098.079153] [812fca95] ? 
 page_fault+0x25/0x30
 Mar 22 20:30:01 solr01-gs kernel: [716098.079222] Mem-Info:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079261] Node 0 DMA per-cpu:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079310] CPU0: hi: 0, btch:   1 
 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079374] CPU1: hi: 0, btch:   1 
 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079439] CPU2: hi: 0, btch:   1 
 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079527] CPU3: hi: 0, btch:   1 
 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079591] Node 0 DMA32 per-cpu:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079642] CPU0: hi: 186, btch:  
 31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079706] CPU1: hi: 186, btch:  
 31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079770] CPU2: hi: 186, btch:  
 31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079834] CPU3: hi: 186, btch:  
 31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079899] Node 0 Normal per-cpu:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079951] CPU0: hi: 186, btch:  
 31 usd:  17
 Mar 22 20:30:01 solr01-gs kernel: [716098.080015] CPU1: hi: 186, btch:  
 31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080079] CPU2: hi: 186, btch:  
 31 usd:   2
 Mar 22 20:30:01 solr01-gs kernel: [716098.080142] CPU3: hi: 186, btch:  
 31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080209] active_anon:2638016 
 inactive_anon:388557 isolated_anon:0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080209]  active_file:68 
 inactive_file:236 isolated_file:0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080210]  unevictable:0 dirty:5 
 writeback:5 unstable:0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080211]  free:16573 
 slab_reclaimable:2398 slab_unreclaimable:2335
 Mar 22 20:30:01 solr01-gs kernel: [716098.080212]  mapped:36 shmem:0 
 pagetables:24750 bounce:0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080575] Node 0 DMA free:15796kB 
 min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB
 active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB 
 isolated(file):0kB present:15244kB mlocked:0kB dirty:0kB writeback:0kB
 mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB 
 kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB
 pages_scanned:0 all_unreclaimable? yes
 Mar 22 20:30:01 solr01-gs kernel: [716098.081041] 

Re: Tlog File not removed after hard commit

2013-03-25 Thread Michael Della Bitta
My understanding is that logs stick around for a while just in case they
can be used to catch up a shard that rejoins the cluster.
 On Mar 24, 2013 12:03 PM, Niran Fajemisin afa...@yahoo.com wrote:

 Hi all,

 We import about 1.5 million documents on a nightly basis using DIH. During
 this time, we need to ensure that all documents make it into the index, and
 otherwise roll back on any errors, which DIH takes care of for us. We also
 disable autoCommit in DIH but instruct it to commit at the very end of the
 import. This is all done through configuration of the DIH config XML file
 and the command issued to the request handler.

 We have noticed that the tlog file appears to linger around even after DIH
 has issued the hard commit. My expectation would be that after the hard
 commit has occurred, the tlog file will be removed. I'm obviously
 misunderstanding how this all works.

 Can someone please help me understand how this is meant to function?
 Thanks!

 -Niran


Retrieving results based on SOLR query data.

2013-03-25 Thread atuldj.jadhav
Hi Team,

I want to overcome a sort issue here; the sort feature works fine.

I have indexed a few documents in Solr, each of which has a unique document ID.
When I retrieve results from Solr, they come back automatically sorted.

However, I would like to fetch results in the sequence I mention in my
Solr query.

http://hostname:8080/SOLR/browse?q=documentID:D12133 OR documentID:D14423 OR
documentID:D912

 I want the results in the same order:
 D12133 
 D14423 
 D912

Regards,
Atul



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Retriving-results-based-on-SOLR-query-data-tp4051076.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [ANNOUNCE] Solr wiki editing change

2013-03-25 Thread Steve Rowe
On Mar 25, 2013, at 3:30 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:
 Can you add me too? We have a few pages which we maintain (search results
 clustering related). My wiki user is DawidWeiss

Added to AdminGroup.

On Mar 25, 2013, at 5:11 AM, Andrzej Bialecki a...@getopt.org wrote:
 Please add AndrzejBialecki to this group. Thank you!

Added to AdminGroup.

On Mar 25, 2013, at 5:48 AM, xie kidd xiezh...@gmail.com wrote:
 Please add adderllyer to this group. Thank you!

Added to ContributorsGroup.

Re: Timeout occured while waiting response from server

2013-03-25 Thread Erick Erickson
A timeout like this _probably_ means your docs were indexed just fine. I'm
curious why adding the docs takes so long; how many docs are you sending at
a time?

Best
Erick


On Thu, Mar 21, 2013 at 1:31 PM, Benjamin, Roy rbenja...@ebay.com wrote:

 I'm calling: m_server.add(docs, 12);

 Wondering if the timeout that expires was the one set when the server was
 created?

 m_server = new HttpSolrServer(serverUrl);
 m_server.setRequestWriter(new BinaryRequestWriter());
 m_server.setConnectionTimeout(3);
 m_server.setSoTimeout(1);

 Also, does the exception always mean the docs were not added?

 Thanks
 Roy

 Solr 3.6


 2013-03-21 10:21:32,487 [main] ERROR org.apache.pig.tools.grunt.Grunt -
 ERROR 2078: Caught error from UDF: checkout.regexudf.SolrAccumulator
 [org.apache.solr.client.solrj.SolrServerException: Timeout occured while
 waiting response from server at: http://10.94.238.86:8080/solr]




Re: Solr 4.2.0 results links

2013-03-25 Thread Erick Erickson
Solr doesn't do anything with links natively; it just echoes back what you
put in. So you're sending file-based links to Solr...

Best
Erick


On Thu, Mar 21, 2013 at 1:40 PM, zeroeffect g.paul.r...@gmail.com wrote:

 While I am still in the beginning phase with Solr, I have been able to index
 a directory of HTML files. I can search keywords and get results. The problem
 I am having is that the links to the HTML documents are file-based rather
 than http-based: I get the link but it points to file:\\ and not http:\\. I
 have been looking for where to set this information. My setup is exporting
 database information to individual HTML files, then FTPing them to the Solr
 server to be indexed and accessed on our intranet.

 Thank you for your guidance.

 ZeroEffect



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-4-2-0-results-links-tp4049788.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: OutOfMemoryError

2013-03-25 Thread Arkadi Colson

Thanks for the info!
I just upgraded java from 6 to 7...
How exactly do you monitor the memory usage and the effect of the
garbage collector?



On 03/25/2013 01:18 PM, Bernd Fehling wrote:

The use of UseG1GC, yes,
but with Solr 4.x, Jetty 8.1.8 and Java HotSpot(TM) 64-Bit Server VM (1.7.0_07).
os.​arch: amd64
os.​name: Linux
os.​version: 2.6.32.13-0.5-xen

Only args are -XX:+UseG1GC -Xms16g -Xmx16g.
Monitoring shows that 16g is a bit high, I might reduce it to 10g or 12g for 
the slaves.
Start is at 5g, runtime is between 6 and 8g with some peaks to 9.5g.
Single index, 130 GByte, 43.5 million documents.

Regards,
Bernd


On 25.03.2013 11:55, Arkadi Colson wrote:

Is somebody using the UseG1GC garbage collector with Solr and Tomcat 7? Any
extra options needed?

Thanks...

On 03/25/2013 08:34 AM, Arkadi Colson wrote:

I changed my system memory to 12GB. Solr now gets -Xms2048m -Xmx8192m as 
parameters. I also added -XX:+UseG1GC to the java process. But now
the whole machine crashes! Any idea why?

Mar 22 20:30:01 solr01-gs kernel: [716098.077809] java invoked oom-killer: 
gfp_mask=0x201da, order=0, oom_adj=0
Mar 22 20:30:01 solr01-gs kernel: [716098.077962] java cpuset=/ mems_allowed=0
Mar 22 20:30:01 solr01-gs kernel: [716098.078019] Pid: 29339, comm: java Not 
tainted 2.6.32-5-amd64 #1
Mar 22 20:30:01 solr01-gs kernel: [716098.078095] Call Trace:
Mar 22 20:30:01 solr01-gs kernel: [716098.078155] [810b6324] ? 
oom_kill_process+0x7f/0x23f
Mar 22 20:30:01 solr01-gs kernel: [716098.078233] [810b6848] ? 
__out_of_memory+0x12a/0x141
Mar 22 20:30:01 solr01-gs kernel: [716098.078309] [810b699f] ? 
out_of_memory+0x140/0x172
Mar 22 20:30:01 solr01-gs kernel: [716098.078385] [810ba704] ? 
__alloc_pages_nodemask+0x4ec/0x5fc
Mar 22 20:30:01 solr01-gs kernel: [716098.078469] [812fb47a] ? 
io_schedule+0x93/0xb7
Mar 22 20:30:01 solr01-gs kernel: [716098.078541] [810bbc69] ? 
__do_page_cache_readahead+0x9b/0x1b4
Mar 22 20:30:01 solr01-gs kernel: [716098.078626] [81064fc0] ? 
wake_bit_function+0x0/0x23
Mar 22 20:30:01 solr01-gs kernel: [716098.078702] [810bbd9e] ? 
ra_submit+0x1c/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.078773] [810b4a72] ? 
filemap_fault+0x17d/0x2f6
Mar 22 20:30:01 solr01-gs kernel: [716098.078849] [810ca9e2] ? 
__do_fault+0x54/0x3c3
Mar 22 20:30:01 solr01-gs kernel: [716098.078921] [810ccd36] ? 
handle_mm_fault+0x3b8/0x80f
Mar 22 20:30:01 solr01-gs kernel: [716098.078999] [8101166e] ? 
apic_timer_interrupt+0xe/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.079078] [812febf6] ? 
do_page_fault+0x2e0/0x2fc
Mar 22 20:30:01 solr01-gs kernel: [716098.079153] [812fca95] ? 
page_fault+0x25/0x30
Mar 22 20:30:01 solr01-gs kernel: [716098.079222] Mem-Info:
Mar 22 20:30:01 solr01-gs kernel: [716098.079261] Node 0 DMA per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079310] CPU0: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079374] CPU1: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079439] CPU2: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079527] CPU3: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079591] Node 0 DMA32 per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079642] CPU0: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079706] CPU1: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079770] CPU2: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079834] CPU3: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079899] Node 0 Normal per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079951] CPU0: hi: 186, btch:  31 
usd:  17
Mar 22 20:30:01 solr01-gs kernel: [716098.080015] CPU1: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080079] CPU2: hi: 186, btch:  31 
usd:   2
Mar 22 20:30:01 solr01-gs kernel: [716098.080142] CPU3: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209] active_anon:2638016 
inactive_anon:388557 isolated_anon:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209]  active_file:68 
inactive_file:236 isolated_file:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080210]  unevictable:0 dirty:5 
writeback:5 unstable:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080211]  free:16573 
slab_reclaimable:2398 slab_unreclaimable:2335
Mar 22 20:30:01 solr01-gs kernel: [716098.080212]  mapped:36 shmem:0 
pagetables:24750 bounce:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080575] Node 0 DMA free:15796kB 
min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:15244kB mlocked:0kB dirty:0kB writeback:0kB
mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB 
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB 

Re: How can I compile and debug Solr from source code?

2013-03-25 Thread Erick Erickson
Furkan:

Stop. Back up. You're making it too complicated. Follow Erik's
instructions. The ant example just compiles all of Solr, just like the
distribution. Then you can go into the example directory and change it to
look just like whatever you want: change the schema, change the solrconfig,
add custom components, etc. There's no difference between that and the
distro. It _is_ the distro, just in a convenient form for running in Jetty.

So you create some custom code (say a filter or whatever). You put the path
to it in your solrconfig in a <lib .../> directive. In fact I usually point
the lib directive at wherever the code gets built by my IDE for
debugging purposes; then I don't have to copy the jar around.
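
For example, a sketch of that directive in solrconfig.xml (the directory path
is illustrative; point it at your IDE's build output):

<lib dir="/home/me/ide-workspace/my-filter/build/classes" />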

I can then set breakpoints in my custom code. I can debug Solr as well.
It's just way cool.

About the only thing I'd add to Hatcher's instructions is the possibility of
specifying suspend=y rather than suspend=n, and that's just if I want
to debug Solr startup code.

BTW, IntelliJ has, under the Edit Configurations section, a Remote
option that guides you through the flags etc. that Erik pointed out. Eclipse
has something similar, but I use IntelliJ.

Best
Erick


On Thu, Mar 21, 2013 at 8:00 PM, Furkan KAMACI furkankam...@gmail.comwrote:

 Ok, I ran that and see that there is a .war file at
 
 /lucene-solr/solr/dist
 
 Do you know how I can run that ant phase from IntelliJ without the command
 line (there are many phases under the Ant build window)? On the other hand,
 within IntelliJ IDEA how can I auto-deploy it into Tomcat? All in all, I
 will edit configurations and it will run that ant command and deploy it to
 Tomcat itself?

 2013/3/22 Steve Rowe sar...@gmail.com

  Perhaps you didn't see what I wrote earlier?:
 
  Sounds like you want 'ant dist', which will create the .war and put it
  into the solr/dist/ directory:
 
  PROMPT$ ant dist
 
  Steve
 
  On Mar 21, 2013, at 7:38 PM, Furkan KAMACI furkankam...@gmail.com
 wrote:
 
   I mean I need this: there is a .war file shipped with the Solr source code.
   How can I regenerate it (build my own code and generate a .war file) like
   that? I will then deploy it to Tomcat.
  
   2013/3/22 Furkan KAMACI furkankam...@gmail.com
  
    Your suggestion is only for the example application? Can I apply it
    to just pure Solr? (I don't want to generate the example application,
    because my aim is not just debugging Solr; I want to extend it and I
    will debug that extended code.)
  
  
   2013/3/22 Alexandre Rafalovitch arafa...@gmail.com
  
   That's nice. Can we put that on a Wiki? Or as a quick screencast?
  
   Regards,
 Alex.
  
   Personal blog: http://blog.outerthoughts.com/
   LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
   - Time is the quality of nature that keeps events from happening all
 at
   once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
  book)
  
  
   On Thu, Mar 21, 2013 at 5:42 PM, Erik Hatcher 
 erik.hatc...@gmail.com
   wrote:
  
   Here's my development/debug workflow:
  
- ant idea at the top-level to generate the IntelliJ project
- cd solr; ant example - to build the full example
- cd example; java -Xdebug
   -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005 -jar
   start.jar - to launch Jetty+Solr in debug mode
- set breakpoints in IntelliJ, set up a Remote run option
   (localhost:5005) in IntelliJ and debug pleasantly
  
   All the unit tests in Solr run very nicely in IntelliJ too, and for
   tight
   development loops, I spend my time doing that instead of running
 full
  on
   Solr.
  
  Erik
  
  
   On Mar 21, 2013, at 05:56 , Furkan KAMACI wrote:
  
   I use Intellij Idea 12 and Solr 4.1 on a Centos 6.4 64 bit
 computer.
  
    I have opened the Solr source code in IntelliJ IDEA as explained in the
    documentation.
    I want to deploy Solr into Tomcat 7. When I open the project there are
    configurations set previously (I used the ant idea command before opening
    the project). However they are all test configurations, and some of them
    do not pass (this is another issue; no need to go into detail in this
    e-mail). I have added a Tomcat Local configuration into the
    configurations, but I don't know which one is the main method of Solr,
    and is there any documentation that explains the code? I.e., I want to
    debug what Solr receives when I say -index from Nutch and what Solr does.
  
    I tried something to run the code (I don't think I could generate a .war
    or an exploded folder) and this is the error that I get (I didn't point
    any artifact in edit configurations):
  
   Error: Exception thrown by the agent :
  java.net.MalformedURLException:
   Local host name unknown: java.net.UnknownHostException: me.local:
   me.local:
   Name or service not known
  
    (me.local is the name I set when I installed CentOS 6.4 on my computer)
  
    Any ideas on how to run the source code would be nice.
  
  
  
  
  
 
 



Re: Continue to the next record

2013-03-25 Thread Erick Erickson
This has been a long-standing issue with updates, several attempts
have been started to change the behavior, but they haven't gotten
off the ground.

Your options are to send one record at a time, or have error-handling
logic that, say, transmits the docs one at a time whenever a packet fails.
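
A sketch of that fallback in SolrJ (assuming server is an HttpSolrServer and
docs is a List<SolrInputDocument>; the field name "id" is illustrative):

try {
    server.add(docs);               // try the whole packet first
} catch (Exception batchFailure) {
    // re-send one at a time so a single bad doc doesn't sink the rest
    for (SolrInputDocument doc : docs) {
        try {
            server.add(doc);
        } catch (Exception docFailure) {
            System.err.println("Skipping doc " + doc.getFieldValue("id")
                + ": " + docFailure.getMessage());
        }
    }
}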

Best
Erick


On Thu, Mar 21, 2013 at 9:21 PM, randolf.julian 
randolf.jul...@dominionenterprises.com wrote:

 I have an XML file that has several documents in it. For example:

 <add>
   <doc>
     <field name="id">1</field>
     <field name="name" update="set">MyName1</field>
   </doc>
   <doc>
     <field name="id">2</field>
     <field name="name" update="set">MyName2</field>
   </doc>
   <doc>
     <field name="id">3</field>
     <field name="name" update="set">MyName3</field>
   </doc>
 </add>

 I upload the data using Solr's post.sh script. For some reason, document 2
 failed and it caused the post.sh script to stop. How can I make it continue
 to the next document (3) even if it fails on 2?

 Thanks



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Continue-to-the-next-record-tp4049920.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr using a ridiculous amount of memory

2013-03-25 Thread John Nielsen
I apologize for the slow reply. Today has been killer. I will reply to
everyone as soon as I get the time.

I am having difficulties understanding how docValues work.

Should I only add docValues to the fields that I actually use for sorting
and faceting, or to all fields?

Will the docValues magic apply only to the fields I activate docValues on, or
to the entire document when sorting/faceting on a field that has docValues
activated?

I'm not even sure which question to ask. I am struggling to understand this
on a conceptual level.


On Sun, Mar 24, 2013 at 7:11 PM, Robert Muir rcm...@gmail.com wrote:

 On Sun, Mar 24, 2013 at 4:19 AM, John Nielsen j...@mcb.dk wrote:

  Schema with DocValues attempt at solving problem:
  http://pastebin.com/Ne23NnW4
  Config: http://pastebin.com/x1qykyXW
 

 This schema isn't using docValues, due to a typo in your config:
 it should not be DocValues=true but docValues=true.

 Are you not getting an error? Solr needs to throw an exception if you
 provide invalid attributes to the field. Nothing is more frustrating
 than having a typo or something in your configuration and Solr just
 ignores it, reports no error, and doesn't work the way you want.
 I'll look into this (I already intend to add these checks to analysis
 factories for the same reason).

 Separately, if you really want the terms data and so on to remain on
 disk, it is not enough to just enable docvalues for the field. The
 default implementation uses the heap. So if you want that, you need to
 set docValuesFormat=Disk on the fieldtype. This will keep the
 majority of the data on disk, and only some key datastructures in heap
 memory. This might have significant performance impact depending upon
 what you are doing so you need to test that.
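
 A minimal sketch of what the corrected schema entries might look like (the
 field and type names here are illustrative, not taken from John's schema):

 <fieldType name="string_dv" class="solr.StrField" docValuesFormat="Disk"/>
 <field name="category" type="string_dv" indexed="true" stored="true"
 docValues="true"/>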




-- 
Med venlig hilsen / Best regards

*John Nielsen*
Programmer



*MCB A/S*
Enghaven 15
DK-7500 Holstebro

Kundeservice: +45 9610 2824
p...@mcb.dk
www.mcb.dk


RE: SOLR - Unable to execute query error - DIH

2013-03-25 Thread Dyer, James
With MS SQL Server, try adding selectMethod=cursor to your connection string
and set your batch size to a reasonable amount (or possibly just omit it; DIH
has a default value it will use).
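
For instance, a sketch of the dataSource element with those changes (assuming
the driver honors the selectMethod property; the batchSize value is
illustrative):

<dataSource type="JdbcDataSource"
  driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
  url="jdbc:sqlserver://server1\sql2012;databaseName=DBName;selectMethod=cursor"
  user="x" password="x" batchSize="500" />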

James Dyer
Ingram Content Group
(615) 213-4311


-Original Message-
From: kobe.free.wo...@gmail.com [mailto:kobe.free.wo...@gmail.com] 
Sent: Monday, March 25, 2013 3:25 AM
To: solr-user@lucene.apache.org
Subject: SOLR - Unable to execute query error - DIH

Hello All,

I am trying to index data from a SQL Server view into Solr using the DIH
with the full-import command. The view has 750K rows and 427 columns. During the
first execution I indexed only the first 50 rows of the view; the data got
indexed in 10 min. But when I executed the same scenario to index the
complete set of 750K rows, the execution continued for 2 days and then
rolled back, giving me the following error:

Unable to execute the query: select * from.

Following is my DIH configuration file,

<dataConfig>
  <dataSource type="JdbcDataSource"
    driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
    url="jdbc:sqlserver://server1\sql2012;databaseName=DBName" user="x"
    password="x" />
  <document name="Search" batchsize="1">
    <entity name="Search" query="select top 500 * from view">
      <field column="ID" name="Id" />

As suggested in some of the posts, I did try with batchsize=-1, but it didn't
work out. Please suggest whether this is the correct approach or whether any
parameter needs to be modified for tuning.

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Unable-to-execute-query-error-DIH-tp4051028.html
Sent from the Solr - User mailing list archive at Nabble.com.




Contributors Group

2013-03-25 Thread Swati Swoboda
Hello,

Can I be added to the contributors group? Username sswoboda.

Thank you.

Swati


Re: Contributors Group

2013-03-25 Thread Steve Rowe

On Mar 25, 2013, at 10:32 AM, Swati Swoboda sswob...@igloosoftware.com wrote:
 Can I be added to the contributors group? Username sswoboda.

Added to solr ContributorsGroup.

Re: Continue to the next record

2013-03-25 Thread randolf.julian
Erick,

Thanks for the info. That's also what I had in mind and that's what I did
since I can't find anything on the web regarding this issue.

Randolf



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Continue-to-the-next-record-tp4049920p4051113.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr 4 automatic DB updates for sync using Delta query DIH with scheduler

2013-03-25 Thread majiedahamed
Hi,

Please let me know how to get DB changes reflected in my Solr index. I am
using Solr 4 with DIH and a delta query with a scheduler in the dataimport
scheduler properties. Ultimately I want my DB to be in sync with Solr.

Everything is all set and working, except that every time I modify the data
in a DB column my scheduler automatically adds a new document to the Solr
index; I therefore get two values with different _version_ values. What I am
looking for is for the index to be updated in place as and when the DB
columns are updated. Kindly assist.

with regards
majied



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-automatic-DB-updates-for-sync-using-Delta-query-DIH-with-scheduler-tp4051114.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: OutOfMemoryError

2013-03-25 Thread Arkadi Colson
How can I see if GC is actually working? Is it written to the Tomcat
logs as well, or will I only see it in the memory graphs?


BR,
Arkadi
On 03/25/2013 03:50 PM, Bernd Fehling wrote:

We use munin with the jmx plugin for monitoring all servers and Solr installations.
(http://munin-monitoring.org/)

Only for short time monitoring we also use jvisualvm delivered with Java SE JDK.
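
For reference, HotSpot can also write its own GC log for offline inspection
(standard JVM flags; the log path is illustrative):

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/tomcat/gc.log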

Regards
Bernd

On 25.03.2013 14:45, Arkadi Colson wrote:

Thanks for the info!
I just upgraded java from 6 to 7...
How exactly do you monitor the memory usage and the effect of the garbage
collector?


On 03/25/2013 01:18 PM, Bernd Fehling wrote:

The use of UseG1GC, yes,
but with Solr 4.x, Jetty 8.1.8 and Java HotSpot(TM) 64-Bit Server VM (1.7.0_07).
os.​arch: amd64
os.​name: Linux
os.​version: 2.6.32.13-0.5-xen

Only args are -XX:+UseG1GC -Xms16g -Xmx16g.
Monitoring shows that 16g is a bit high, I might reduce it to 10g or 12g for 
the slaves.
Start is at 5g, runtime is between 6 and 8g with some peaks to 9.5g.
Single index, 130 GByte, 43.5 million documents.

Regards,
Bernd


On 25.03.2013 11:55, Arkadi Colson wrote:

Is somebody using the UseG1GC garbage collector with Solr and Tomcat 7? Any
extra options needed?

Thanks...

On 03/25/2013 08:34 AM, Arkadi Colson wrote:

I changed my system memory to 12GB. Solr now gets -Xms2048m -Xmx8192m as 
parameters. I also added -XX:+UseG1GC to the java process. But now
the whole machine crashes! Any idea why?

Mar 22 20:30:01 solr01-gs kernel: [716098.077809] java invoked oom-killer: 
gfp_mask=0x201da, order=0, oom_adj=0
Mar 22 20:30:01 solr01-gs kernel: [716098.077962] java cpuset=/ mems_allowed=0
Mar 22 20:30:01 solr01-gs kernel: [716098.078019] Pid: 29339, comm: java Not 
tainted 2.6.32-5-amd64 #1
Mar 22 20:30:01 solr01-gs kernel: [716098.078095] Call Trace:
Mar 22 20:30:01 solr01-gs kernel: [716098.078155] [810b6324] ? 
oom_kill_process+0x7f/0x23f
Mar 22 20:30:01 solr01-gs kernel: [716098.078233] [810b6848] ? 
__out_of_memory+0x12a/0x141
Mar 22 20:30:01 solr01-gs kernel: [716098.078309] [810b699f] ? 
out_of_memory+0x140/0x172
Mar 22 20:30:01 solr01-gs kernel: [716098.078385] [810ba704] ? 
__alloc_pages_nodemask+0x4ec/0x5fc
Mar 22 20:30:01 solr01-gs kernel: [716098.078469] [812fb47a] ? 
io_schedule+0x93/0xb7
Mar 22 20:30:01 solr01-gs kernel: [716098.078541] [810bbc69] ? 
__do_page_cache_readahead+0x9b/0x1b4
Mar 22 20:30:01 solr01-gs kernel: [716098.078626] [81064fc0] ? 
wake_bit_function+0x0/0x23
Mar 22 20:30:01 solr01-gs kernel: [716098.078702] [810bbd9e] ? 
ra_submit+0x1c/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.078773] [810b4a72] ? 
filemap_fault+0x17d/0x2f6
Mar 22 20:30:01 solr01-gs kernel: [716098.078849] [810ca9e2] ? 
__do_fault+0x54/0x3c3
Mar 22 20:30:01 solr01-gs kernel: [716098.078921] [810ccd36] ? 
handle_mm_fault+0x3b8/0x80f
Mar 22 20:30:01 solr01-gs kernel: [716098.078999] [8101166e] ? 
apic_timer_interrupt+0xe/0x20
Mar 22 20:30:01 solr01-gs kernel: [716098.079078] [812febf6] ? 
do_page_fault+0x2e0/0x2fc
Mar 22 20:30:01 solr01-gs kernel: [716098.079153] [812fca95] ? 
page_fault+0x25/0x30
Mar 22 20:30:01 solr01-gs kernel: [716098.079222] Mem-Info:
Mar 22 20:30:01 solr01-gs kernel: [716098.079261] Node 0 DMA per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079310] CPU0: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079374] CPU1: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079439] CPU2: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079527] CPU3: hi: 0, btch:   1 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079591] Node 0 DMA32 per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079642] CPU0: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079706] CPU1: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079770] CPU2: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079834] CPU3: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.079899] Node 0 Normal per-cpu:
Mar 22 20:30:01 solr01-gs kernel: [716098.079951] CPU0: hi: 186, btch:  31 
usd:  17
Mar 22 20:30:01 solr01-gs kernel: [716098.080015] CPU1: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080079] CPU2: hi: 186, btch:  31 
usd:   2
Mar 22 20:30:01 solr01-gs kernel: [716098.080142] CPU3: hi: 186, btch:  31 
usd:   0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209] active_anon:2638016 
inactive_anon:388557 isolated_anon:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080209]  active_file:68 
inactive_file:236 isolated_file:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080210]  unevictable:0 dirty:5 
writeback:5 unstable:0
Mar 22 20:30:01 solr01-gs kernel: [716098.080211]  free:16573 
slab_reclaimable:2398 slab_unreclaimable:2335
Mar 22 20:30:01 solr01-gs kernel: [716098.080212]  mapped:36 

Re: Slow queries for common terms

2013-03-25 Thread Erick Erickson
take a look here:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

looking at memory consumption can be a bit tricky to interpret with
MMapDirectory.

But you say "I see the CPU working very hard", which implies that your issue
is just scoring 90M documents. A way to test: try q=*:*&fq=field:book. My
bet is that that will be much faster, in which case scoring is your
choke-point and you'll need to spread that load across more servers, i.e.
shard.

When running the above, make sure of a couple of things:
1) you haven't run the fq query before (or you have the filterCache turned
completely off);
2) you _have_ run a query or two that warms up your low-level caches.
Doesn't matter what, just as long as it doesn't have an fq clause.

Best
Erick



On Sat, Mar 23, 2013 at 3:10 AM, David Parks davidpark...@yahoo.com wrote:

 I see the CPU working very hard, and at the same time I see 2 MB/sec disk
 access for that 15 seconds. I am not running it this instant, but it seemed
 to me that there were more CPU cycles available, so unless it's an issue of
 not being able to multithread it any further, I'd say it's more IO related.

 I'm going to set up solr cloud and shard across the 2 servers I have
 available for now. It's not an optimal setup we have while we're in a
 private beta period, but maybe it'll improve things (I've got 2 servers
 with
 2x 4TB disks in raid-0 shared with the webservers).

 I'll work towards some improved IO performance and maybe more shards and
 see
 how things go. I'll also be able to up the RAM in just a couple of weeks.

 Are there any settings I should think of in terms of improving cache
 performance when I can give it say 10GB of RAM?

 Thanks, this has been tremendously helpful.

 David


 -Original Message-
 From: Tom Burton-West [mailto:tburt...@umich.edu]
 Sent: Saturday, March 23, 2013 1:38 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Slow queries for common terms

 Hi David and Jan,

 I wrote the blog post, and David, you are right, the problem we had was
 with
 phrase queries because our positions lists are so huge.  Boolean
 queries don't need to read the positions lists.   I think you need to
 determine whether you are CPU bound or I/O bound.It is possible that
 you are I/O bound and reading the term frequency postings for 90 million
 docs is taking a long time.  In that case, More memory in the machine (but
 not dedicated to Solr) might help because Solr relies on OS disk caching
 for
 caching the postings lists.  You would still need to do some cache warming
 with your most common terms.

 On the other hand as Jan pointed out, you may be cpu bound because Solr
 doesn't have early termination and has to rank all 90 million docs in order
 to show the top 10 or 25.

 Did you try the OR search to see if your CPU is at 100%?

 Tom

 On Fri, Mar 22, 2013 at 10:14 AM, Jan Høydahl jan@cominvent.com
 wrote:

  Hi
 
  There might not be a final cure with more RAM if you are CPU bound.
  Scoring 90M docs is some work. Can you check what's going on during
  those
  15 seconds? Is your CPU at 100%? Try a (foo OR bar OR baz) search
  which generates 100mill hits and see if that is slow too, even if you
  don't use frequent words.
 
  I'm sure you can find other frequent terms in your corpus which
  display similar behaviour, words which are even more frequent than
  "book". Are you using AND as the default operator? You will benefit from
  limiting the number of results as much as possible.
 
  The real solution is to shard across N number of servers, until you
  reach the desired performance for the desired indexing/querying load.
 
  --
  Jan Høydahl, search solution architect Cominvent AS -
  www.cominvent.com Solr Training - www.solrtraining.com
 
 




Re: Two problems (missing updates and timeouts)

2013-03-25 Thread Erick Erickson
For your first problem I'd be looking at the solr logs and verifying that
1> the update was sent
2> no stack traces are thrown
3> you probably already know all about commits, but just in case, check that
the commit interval has passed (a quick way to rule this out is sketched below).
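
For illustration, a hedged SolrJ sketch of that commit check (URL and document
id are made up; a Sunspot/Rails stack would do the equivalent through its own
client):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CommitCheck {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr"); // assumed URL
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "commit-check-1"); // hypothetical test document
    server.add(doc);
    server.commit(); // if the doc is searchable only after this, the problem
                     // is commit timing rather than a genuinely lost update
  }
}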

For your second problem, I'm not quite sure where you're setting these
timeouts. SolrJ?

Best
Erick


On Sat, Mar 23, 2013 at 4:23 PM, Aaron Jensen aaronjen...@gmail.com wrote:

 Hi all,

 I'm having two problems with our solr implementation. I don't have a lot of
 detail about them because we're just starting to get into diagnosing them.
 I'm hoping for some help with that diagnosis, ideas, tips, whatever.

 Our stack:

 Rails
 Sunspot Solr
 sunspot_index_queue
 two solr servers, master and slave, all traffic currently going to master,
 slave is just a replication slave/backup.


 The first and biggest problem is that we occasionally lose updates.
 Something will get added to the database, it will trigger a solr update,
 but then we can't search for that thing. It's just gone. Indexing that
 thing again will have it show up. There are a number of moving parts in our
 stack and this is a relatively new problem. It was working fine for 1.5
 years without a problem. We're considering adding a delayed job that will
 index anything that is newly created a second after it is created just to
 be sure but this is a giant hack. Any ideas around this would be helpful.



 The second problem is that we get occasional timeouts. These don't happen
 very often, maybe 5-7/day. Solr is serving at most like 350 requests per
 minute. Our timeouts are set to 2 seconds on read and 1 second on open.
 Average response time is around 20ms. It doesn't seem like any requests
 should be timing out but they are. I have no idea how to debug it either.
 Any ideas?

 Thanks,

 Aaron




Re: Solr 4.2 Incremental backups

2013-03-25 Thread Erick Erickson
That's essentially what replication does: it only copies the parts of the index
that have changed. However, when segments merge, that might mean the entire
index needs to be replicated.
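
If a point-in-time snapshot is what's actually wanted, the replication
handler's backup command is the usual route; a hedged SolrJ sketch (URL and
handler path are assumptions):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class TriggerBackup {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1"); // assumed
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("command", "backup");  // ask the ReplicationHandler for a snapshot
    QueryRequest request = new QueryRequest(params);
    request.setPath("/replication");  // the handler's conventional path
    server.request(request);
  }
}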

Best
Erick


On Sun, Mar 24, 2013 at 12:08 AM, Sandeep Kumar Anumalla 
sanuma...@etisalat.ae wrote:

 Hi,

 Is there any option to do Incremental backups in Solr 4.2?

 Thanks & Regards
 Sandeep A
 Ext : 02618-2856
 M : 0502493820


 



Re: Too many fields to Sort in Solr

2013-03-25 Thread Erick Erickson
Certainly that will be true for the bare q=*:*, I meant with the boosting
clause added.

Best
Erick


On Sun, Mar 24, 2013 at 7:01 PM, adityab aditya_ba...@yahoo.com wrote:

 Thanks Erick. In this query (q=*:*) the Lucene score is always 1.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Too-many-fields-to-Sort-in-Solr-tp4049139p4050944.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: OutOfMemoryError

2013-03-25 Thread Bernd Fehling
You can also use -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails 
-Xloggc:gc.log
as additional options to get a gc.log file and see what GC is doing.

Regards
Bernd

Am 25.03.2013 16:01, schrieb Arkadi Colson:
 How can I see if GC is actually working? Is it written in the tomcat logs as 
 well or will I only see it in the memory graphs?
 
 BR,
 Arkadi
 On 03/25/2013 03:50 PM, Bernd Fehling wrote:
 We use munin with the jmx plugin for monitoring all servers and Solr
 installations.
 (http://munin-monitoring.org/)

 Only for short time monitoring we also use jvisualvm delivered with Java SE 
 JDK.

 Regards
 Bernd

 Am 25.03.2013 14:45, schrieb Arkadi Colson:
 Thanks for the info!
 I just upgraded java from 6 to 7...
 How exactly do you monitor the memory usage and the effect of the garbage
 collector?


 On 03/25/2013 01:18 PM, Bernd Fehling wrote:
 The use of UseG1GC, yes,
 but with Solr 4.x, Jetty 8.1.8 and Java HotSpot(TM) 64-Bit Server VM 
 (1.7.0_07).
 os.arch: amd64
 os.name: Linux
 os.version: 2.6.32.13-0.5-xen

 Only args are -XX:+UseG1GC -Xms16g -Xmx16g.
 Monitoring shows that 16g is a bit high, I might reduce it to 10g or 12g 
 for the slaves.
 Start is at 5g, runtime is between 6 and 8g with some peaks to 9.5g.
 Single index, 130GByte, 43.5 mio. documents.

 Regards,
 Bernd


 Am 25.03.2013 11:55, schrieb Arkadi Colson:
 Is sombody using the UseG1GC garbage collector with Solr and Tomcat 7? 
 Any extra options needed?

 Thanks...

 On 03/25/2013 08:34 AM, Arkadi Colson wrote:
 I changed my system memory to 12GB. Solr now gets -Xms2048m -Xmx8192m as 
 parameters. I also added -XX:+UseG1GC to the java process. But now
 the whole machine crashes! Any idea why?

 Mar 22 20:30:01 solr01-gs kernel: [716098.077809] java invoked 
 oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
 Mar 22 20:30:01 solr01-gs kernel: [716098.077962] java cpuset=/ 
 mems_allowed=0
 Mar 22 20:30:01 solr01-gs kernel: [716098.078019] Pid: 29339, comm: java 
 Not tainted 2.6.32-5-amd64 #1
 Mar 22 20:30:01 solr01-gs kernel: [716098.078095] Call Trace:
 Mar 22 20:30:01 solr01-gs kernel: [716098.078155] [810b6324] ? 
 oom_kill_process+0x7f/0x23f
 Mar 22 20:30:01 solr01-gs kernel: [716098.078233] [810b6848] ? 
 __out_of_memory+0x12a/0x141
 Mar 22 20:30:01 solr01-gs kernel: [716098.078309] [810b699f] ? 
 out_of_memory+0x140/0x172
 Mar 22 20:30:01 solr01-gs kernel: [716098.078385] [810ba704] ? 
 __alloc_pages_nodemask+0x4ec/0x5fc
 Mar 22 20:30:01 solr01-gs kernel: [716098.078469] [812fb47a] ? 
 io_schedule+0x93/0xb7
 Mar 22 20:30:01 solr01-gs kernel: [716098.078541] [810bbc69] ? 
 __do_page_cache_readahead+0x9b/0x1b4
 Mar 22 20:30:01 solr01-gs kernel: [716098.078626] [81064fc0] ? 
 wake_bit_function+0x0/0x23
 Mar 22 20:30:01 solr01-gs kernel: [716098.078702] [810bbd9e] ? 
 ra_submit+0x1c/0x20
 Mar 22 20:30:01 solr01-gs kernel: [716098.078773] [810b4a72] ? 
 filemap_fault+0x17d/0x2f6
 Mar 22 20:30:01 solr01-gs kernel: [716098.078849] [810ca9e2] ? 
 __do_fault+0x54/0x3c3
 Mar 22 20:30:01 solr01-gs kernel: [716098.078921] [810ccd36] ? 
 handle_mm_fault+0x3b8/0x80f
 Mar 22 20:30:01 solr01-gs kernel: [716098.078999] [8101166e] ? 
 apic_timer_interrupt+0xe/0x20
 Mar 22 20:30:01 solr01-gs kernel: [716098.079078] [812febf6] ? 
 do_page_fault+0x2e0/0x2fc
 Mar 22 20:30:01 solr01-gs kernel: [716098.079153] [812fca95] ? 
 page_fault+0x25/0x30
 Mar 22 20:30:01 solr01-gs kernel: [716098.079222] Mem-Info:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079261] Node 0 DMA per-cpu:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079310] CPU0: hi: 0, btch: 
   1 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079374] CPU1: hi: 0, btch: 
   1 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079439] CPU2: hi: 0, btch: 
   1 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079527] CPU3: hi: 0, btch: 
   1 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079591] Node 0 DMA32 per-cpu:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079642] CPU0: hi: 186, 
 btch:  31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079706] CPU1: hi: 186, 
 btch:  31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079770] CPU2: hi: 186, 
 btch:  31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079834] CPU3: hi: 186, 
 btch:  31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.079899] Node 0 Normal per-cpu:
 Mar 22 20:30:01 solr01-gs kernel: [716098.079951] CPU0: hi: 186, 
 btch:  31 usd:  17
 Mar 22 20:30:01 solr01-gs kernel: [716098.080015] CPU1: hi: 186, 
 btch:  31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080079] CPU2: hi: 186, 
 btch:  31 usd:   2
 Mar 22 20:30:01 solr01-gs kernel: [716098.080142] CPU3: hi: 186, 
 btch:  31 usd:   0
 Mar 22 20:30:01 solr01-gs kernel: [716098.080209] active_anon:2638016 
 inactive_anon:388557 isolated_anon:0
 Mar 22 20:30:01 solr01-gs kernel: 

Re: Undefined field problem.

2013-03-25 Thread Erick Erickson
Unless you're manually typing things and made a typo, your problem is that
your csv file defines:

active_cruises
and your schema has
active_cruise

Note the lack of an 's'...

Best
Erick


On Mon, Mar 25, 2013 at 6:30 AM, Mid Night mid...@gmail.com wrote:

 Further to the prev msg:  Here's an extract from my current schema.xml:

   <field name="show_en" type="boolean" indexed="true" stored="false"
    required="true" />
   <field name="active_cruise" type="boolean" indexed="true"
    stored="true"/>
   <field name="non_grata" type="boolean" indexed="true" stored="true"/>
   <field name="toptipp" type="int" indexed="true" stored="true"/>



 The original schema.xml had the last 3 fields in the order toptipp,
 active_cruise and non_grata.  Active_cruise and non_grata were also defined
 as type=int.  I changed the order and field types in my attempts to fix
 the error.





 On 25 March 2013 11:21, Mid Night mid...@gmail.com wrote:

  Hi,
 
 
  I recently added a new field (toptipp) to an existing solr schema.xml and
  it worked just fine.  Subsequently I added two more fields (active_cruises
  and non_grata) to the schema and now I get this error:
 
  <?xml version="1.0" encoding="UTF-8"?>
  <response>
  <lst name="responseHeader"><int name="status">400</int><int
  name="QTime">6</int></lst><lst name="error"><str name="msg">undefined
  field: active_cruise</str><int name="code">400</int></lst>
  </response>
 
 
  My solr db is populated via a program that creates and uploads a csv
  file.  When I view the csv file, the field active_cruises (given as
  undefined above), is populated correctly.  As far as I can tell, when I
  added the final fields to the schema, I did exactly the same as when I
  added toptipp.  I updated schema.xml and restarted solr (java -jar
  start.jar).
 
  I am really at a loss here.  Can someone please help with the answer or
 by
  pointing me in the right direction?  Naturally I'd be happy to provide
  further info if needed.
 
 
  Thanks
  MK
 
 
 
 
 
 
 
 



Re: Tlog File not removed after hard commit

2013-03-25 Thread Erick Erickson
The tlogs will stay there to provide peer synch on the last 100 docs. Say
a node somehow gets out of synch. There are two options:
1> replay from the log
2> replicate the entire index.

To avoid 2> if possible, the tlog is kept around. In your case, all your
data is put in the tlog file, so the "keep the last 100 docs available"
rule means you'll keep the entire log for the run around until the _next_
run completes, at which point I'd expect the oldest one to be deleted.

Best
Erick


On Mon, Mar 25, 2013 at 8:40 AM, Michael Della Bitta 
michael.della.bi...@appinions.com wrote:

 My understanding is that logs stick around for a while just in case they
 can be used to catch up a shard that rejoins the cluster.
  On Mar 24, 2013 12:03 PM, Niran Fajemisin afa...@yahoo.com wrote:

  Hi all,
 
  We import about 1.5 million documents on a nightly basis using DIH.
 During
  this time, we need to ensure that all documents make it into the index or
  otherwise roll back on any errors, which DIH takes care of for us. We also
  disable autoCommit in DIH but instruct it to commit at the very end of
 the
  import. This is all done through configuration of the DIH config XML file
  and the command issued to the request handler.
 
  We have noticed that the tlog file appears to linger around even after
 DIH
  has issued the hard commit. My expectation would be that after the hard
  commit has occurred, the tlog file will be removed. I'm obviously
  misunderstanding how this all works.
 
  Can someone please help me understand how this is meant to function?
  Thanks!
 
  -Niran



Re: Retriving results based on SOLR query data.

2013-03-25 Thread Erick Erickson
There's no good way that I know of to have Solr do that for you.

But you have the original query so it seems like your app layer could sort
the results accordingly.
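
A hedged SolrJ sketch of that app-layer re-sort, using the documentID field
from the question (the server URL is assumed):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;

public class OrderedFetch {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8080/SOLR"); // assumed URL
    final List<String> requested = Arrays.asList("D12133", "D14423", "D912");
    SolrQuery q = new SolrQuery("documentID:(D12133 OR D14423 OR D912)");
    List<SolrDocument> docs = new ArrayList<SolrDocument>(server.query(q).getResults());
    // Re-impose the order of the original ID list on Solr's result order.
    Collections.sort(docs, new Comparator<SolrDocument>() {
      public int compare(SolrDocument a, SolrDocument b) {
        return requested.indexOf(a.getFieldValue("documentID"))
             - requested.indexOf(b.getFieldValue("documentID"));
      }
    });
  }
}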

Best
Erick


On Mon, Mar 25, 2013 at 8:44 AM, atuldj.jadhav atuldj.jad...@gmail.com wrote:

 Hi Team,

 I want to overcome a sort issue here; the sort feature itself works fine.

 I have indexed a few documents in SOLR, which have a unique document ID.
 Now when I retrieve results from SOLR, the results come back automatically sorted.

 However I would like to fetch results based on the sequence I mention in my
 SOLR query.

 http://hostname:8080/SOLR/browse?q=documentID:D12133 OR documentID:D14423
 OR
 documentID:D912

 I want results in the same order...
  D12133
  D14423
  D912

 Regards,
 Atul



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Retriving-results-based-on-SOLR-query-data-tp4051076.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Query slow with termVectors termPositions termOffsets

2013-03-25 Thread Ravi Solr
Hello,
We re-indexed our entire core of 115 docs with some of the
fields having termVectors="true" termPositions="true" termOffsets="true";
prior to the reindex we only had termVectors="true". After the reindex,
the query component has become very slow. I thought that adding
termOffsets and termPositions would increase the speed; am I wrong? Several
queries like the one shown below, which used to run fine, are now very slow.
Can somebody kindly clarify how termOffsets and termPositions affect the query
component?

<lst name="process"><double name="time">19076.0</double>
  <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">18972.0</double></lst>
  <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
  <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
  <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
  <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
  <lst name="org.apache.solr.handler.component.QueryElevationComponent"><double name="time">0.0</double></lst>
  <lst name="org.apache.solr.handler.clustering.ClusteringComponent"><double name="time">0.0</double></lst>
  <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">104.0</double></lst>
</lst>


[#|2013-03-25T11:22:53.446-0400|INFO|sun-appserver2.1|org.apache.solr.core.SolrCore|_ThreadID=45;_ThreadName=httpSSLWorkerThread-9001-19;|[xxx]
webapp=/solr-admin path=/select
params={q=primarysectionnode:(/national*+OR+/health*)+OR+(contenttype:Blog+AND+subheadline:(The+Checkup+OR+Checkpoint+Washington+OR+Post+Carbon+OR+TSA+OR+College+Inc.+OR+Campus+Overload+OR+Planet+Panel+OR+The+Answer+Sheet+OR+Class+Struggle+OR+BlogPost))+OR+(contenttype:Photo+Gallery+AND+headline:day+in+photos)start=0rows=1sort=displaydatetime+descfq=-source:(Reuters+OR+PC+World+OR+CBS+News+OR+NC8/WJLA+OR+NewsChannel+8+OR+NC8+OR+WJLA+OR+CBS)+-contenttype:(Discussion+OR+Photo)+-slug:(op-*dummy*+OR+noipad-*)+-(contenttype:Photo+Gallery+AND+headline:(Drawing+Board+OR+Drawing+board+OR+drawing+board))+headline:[*+TO+*]+contenttype:[*+TO+*]+pubdatetime:[NOW/DAY-3YEARS+TO+NOW/DAY%2B1DAY]+-headline:(Summary+Box*+OR+Video*+OR+Post+Sports+Live*)+-slug:(warren*+OR+history)+-(contenttype:Blog+AND+subheadline:(DC+Schools+Insider+OR+On+Leadership))+contenttype:Blog+-systemid:(999c7102-955a-11e2-95ca-dd43e7ffee9c+OR+72bbb724-9554-11e2-95ca-dd43e7ffee9c+OR+2d008b80-9520-11e2-95ca-dd43e7ffee9c+OR+d2443d3c-9514-11e2-95ca-dd43e7ffee9c+OR+173764d6-9520-11e2-95ca-dd43e7ffee9c+OR+0181fd42-953c-11e2-95ca-dd43e7ffee9c+OR+e6cacb96-9559-11e2-95ca-dd43e7ffee9c+OR+03288052-9501-11e2-95ca-dd43e7ffee9c+OR+ddbf020c-9517-11e2-95ca-dd43e7ffee9c)+fullbody:[*+TO+*]wt=javabinversion=2}
hits=4985 status=0 QTime=19044 |#]

Thanks,

Ravi Kiran Bhaskar


Re: Undefined field problem.

2013-03-25 Thread Jack Krupansky
Generally, you will need to delete the index and completely reindex your 
data if you change the type of a field.


I don't think that would account for active_cruise being an undefined field 
though.


I did try your scenario with the Solr 4.2 example, and a field named 
active_cruise, and it worked fine for me. The only issue was that existing 
data (e.g., 1 in the int field) was all considered as boolean false after I 
changed the schema and restarted.
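
A minimal SolrJ sketch of that wipe-and-reindex (the server URL is assumed;
the reindex itself would be the normal CSV upload):

import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class WipeIndex {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr"); // assumed URL
    server.deleteByQuery("*:*"); // drop every document indexed under the old type
    server.commit();
    // ...then re-run the normal CSV upload so all docs use the new field type...
  }
}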


-- Jack Krupansky

-Original Message- 
From: Mid Night

Sent: Monday, March 25, 2013 6:30 AM
To: solr-user@lucene.apache.org
Subject: Re: Undefined field problem.

Further to the prev msg:  Here's an extract from my current schema.xml:

  <field name="show_en" type="boolean" indexed="true" stored="false"
required="true" />
  <field name="active_cruise" type="boolean" indexed="true" stored="true"/>
  <field name="non_grata" type="boolean" indexed="true" stored="true"/>
  <field name="toptipp" type="int" indexed="true" stored="true"/>



The original schema.xml had the last 3 fields in the order toptipp,
active_cruise and non_grata.  Active_cruise and non_grata were also defined
as type=int.  I changed the order and field types in my attempts to fix
the error.





On 25 March 2013 11:21, Mid Night mid...@gmail.com wrote:


Hi,


I recently added a new field (toptipp) to an existing solr schema.xml and
it worked just fine.  Subsequently I added two more fields (active_cruises
and non_grata) to the schema and now I get this error:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int
name="QTime">6</int></lst><lst name="error"><str name="msg">undefined
field: active_cruise</str><int name="code">400</int></lst>

</response>


My solr db is populated via a program that creates and uploads a csv
file.  When I view the csv file, the field active_cruises (given as
undefined above), is populated correctly.  As far as I can tell, when I
added the final fields to the schema, I did exactly the same as when I
added toptipp.  I updated schema.xml and restarted solr (java -jar
start.jar).

I am really at a loss here.  Can someone please help with the answer or by
pointing me in the right direction?  Naturally I'd be happy to provide
further info if needed.


Thanks
MK












Re: Contributors Group

2013-03-25 Thread Upayavira
While you're in that mode, could you please add 'Upayavira'.

Thanks!

Upayavira

On Mon, Mar 25, 2013, at 02:41 PM, Steve Rowe wrote:
 
 On Mar 25, 2013, at 10:32 AM, Swati Swoboda sswob...@igloosoftware.com
 wrote:
  Can I be added to the contributors group? Username sswoboda.
 
 Added to solr ContributorsGroup.


Re: Contributors Group

2013-03-25 Thread Steve Rowe
On Mar 25, 2013, at 11:59 AM, Upayavira u...@odoko.co.uk wrote:
 While you're in that mode, could you please add 'Upayavira'.

Added to solr ContributorsGroup.


lucene 42 codec

2013-03-25 Thread Mario Casola
Hi,

I noticed that apache solr 4.2 uses the lucene codec 4.1. How can I
switch to 4.2?

Thanks in advance
Mario


Re: Query slow with termVectors termPositions termOffsets

2013-03-25 Thread alxsss
Did index size increase after turning on termPositions and termOffsets?

Thanks.
Alex.

 

 

 

-Original Message-
From: Ravi Solr ravis...@gmail.com
To: solr-user solr-user@lucene.apache.org
Sent: Mon, Mar 25, 2013 8:27 am
Subject: Query slow with termVectors termPositions termOffsets


[Ravi's message is quoted in full above; snipped]


Error creating collection using CORE-API

2013-03-25 Thread yriveiro
Hi,

I'm having an issue when trying to create a collection:


curl
"http://192.168.1.142:8983/solr/admin/cores?action=CREATE&name=RT-4A46DF1563_12&collection=RT-4A46DF1563_12&shard=00&collection.configName=reportssBucket-regular"


The curl call has an error because the collection.configName doesn't exist
(note the typo "reportss"), so I fixed the curl call to:


curl
"http://192.168.1.142:8983/solr/admin/cores?action=CREATE&name=RT-4A46DF1563_12&collection=RT-4A46DF1563_12&shard=00&collection.configName=reportsBucket-regular"


But now I have this stacktrace:

INFO: Creating SolrCore 'RT-4A46DF1563_12' using instanceDir:
/Users/yriveiro/Dump/solrCloud/node00.solrcloud/solr/home/RT-4A46DF1563_12
Mar 25, 2013 5:15:35 PM org.apache.solr.cloud.ZkController
createCollectionZkNode
INFO: Check for collection zkNode:RT-4A46DF1563_12
Mar 25, 2013 5:15:35 PM org.apache.solr.cloud.ZkController
createCollectionZkNode
INFO: Collection zkNode exists
Mar 25, 2013 5:15:35 PM org.apache.solr.cloud.ZkController readConfigName
INFO: Load collection config from:/collections/RT-4A46DF1563_12
Mar 25, 2013 5:15:35 PM org.apache.solr.cloud.ZkController readConfigName
SEVERE: Specified config does not exist in ZooKeeper:reportssBucket-regular
Mar 25, 2013 5:15:35 PM org.apache.solr.core.CoreContainer recordAndThrow
SEVERE: Unable to create core: RT-4A46DF1563_12
org.apache.solr.common.cloud.ZooKeeperException: Specified config does not
exist in ZooKeeper:reportssBucket-regular


In fact the collection config is in zookeeper as a file and not as a folder. The
question here is: if the CREATE command doesn't find the config, why is a
file created? And why, after this, can't I run the command again with the
correct syntax without removing the file created by the failed CREATE command?



-
Best regards
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-creating-collection-using-CORE-API-tp4051156.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Strange error in Solr 4.2

2013-03-25 Thread skp
I fixed it by setting JVM properties in glassfish.

-Djavax.net.ssl.keyStorePassword=changeit 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strange-error-in-Solr-4-2-tp4047386p4051159.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tlog File not removed after hard commit

2013-03-25 Thread Niran Fajemisin
Thanks Erick and Michael for the prompt responses.

Cheers,
Niran




 From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org 
Sent: Monday, March 25, 2013 10:21 AM
Subject: Re: Tlog File not removed after hard commit
 
[Erick's reply, including the quoted thread, appears in full earlier in this
digest; snipped]





Re: Multi-core and replicated Solr cloud testing. Data-directory mis-configures

2013-03-25 Thread Trevor Campbell
That example does not work if you have > 1 collection (core) per node; all
cores end up sharing the same index and overwrite one another.


On Mon, Mar 25, 2013 at 6:27 PM, Gopal Patwa gopalpa...@gmail.com wrote:

  If you use the default directory then it will use the solr.home directory. I have
  tested the solr cloud example on a local machine with 5-6 nodes, and the data
  directory was created under the core name, like

  example2/solr/collection1/data. You can see an example startup script in the
  source code: solr/cloud-dev/solrcloud-multi-start.sh

  example solrconfig.xml:

    <dataDir>${solr.data.dir:}</dataDir>

 On Sun, Mar 24, 2013 at 10:44 PM, Trevor Campbell
  tcampb...@atlassian.com wrote:

  I have three indexes which I have set up as three separate cores, using
  this solr.xml config.
 
    <cores adminPath="/admin/cores" host="${host:}"
    hostPort="${jetty.port:}">
      <core name="jira-issue" instanceDir="jira-issue">
        <property name="dataDir" value="jira-issue/data/" />
      </core>
      <core name="jira-comment" instanceDir="jira-comment">
        <property name="dataDir" value="jira-comment/data/" />
      </core>
      <core name="jira-change-history" instanceDir="jira-change-history">
        <property name="dataDir" value="jira-change-history/data/" />
      </core>
    </cores>
 
   This works just fine as standalone solr.

   I duplicated this setup on the same machine under a completely separate
   solr installation (solr-nodeb) and modified all the data directories to
   point to the directories in nodeb.  This all worked fine.

   I then connected the 2 instances together with zoo-keeper using settings
   -Dbootstrap_conf=true -Dcollection.configName=jiraCluster -DzkRun
   -DnumShards=1 for the first instance and -DzkHost=localhost:9080 for
   the second. (I'm using tomcat and ports 8080 and 8081 for the 2 Solr
   instances)
 
  Now the data directories of the second node point to the data directories
  in the first node.
 
   I have tried many settings in the solrconfig.xml for each core but am now
   using absolute paths, e.g.
   <dataDir>/home//solr-4.2.0-nodeb/example/multicore/jira-comment/data</dataDir>

   previously I used
   ${solr.jira-comment.data.dir:/home/tcampbell/solr-4.2.0-nodeb/example/multicore/jira-comment/data}
   but that had the same result.
 
  It seems zookeeper is forcing data directory config from the uploaded
  configuration on all the nodes in the cluster?
 
  How can I do testing on a single machine? Do I really need identical
  directory layouts on all machines?
 
 
 



Re: DocValues and field requirements

2013-03-25 Thread Marcin Rzewucki
Hi Chris,

Thanks for your detailed explanations. The default value is a difficult
limitation, especially for financial figures. I may try a
workaround like the lowest possible number for TrieLongField, but it would be
better to avoid that :)

Regards.

On 22 March 2013 20:39, Chris Hostetter hossman_luc...@fucit.org wrote:


 : Thank you for your response. Yes, that's strange. By enabling DocValues
 the
 : information about missing fields is lost, which changes the way of
 sorting
 : as well. Adding default value to the fields can change a logic of
 : application dramatically (I can't set default value to 0 for all
 : Trie*Fields fields, because it could impact the results displayed to the
 : end user, which is not good). It's a pity that using DocValues is so
 : limited.

 I'm not really up on docvalues, but i asked rmuir about this a bit on IRC

 the crux of the issue is that there are two different docvalue impls: one
 that uses a fixed amount of space per doc (ie: exactly one value per doc)
 and one that allows an ordered set of values per doc (ie: multivalued).

 the multivalued docvals impl was wired into solr for multivalued fields,
 and the single valued docvals impl was wired in for the single valued case
 -- but since the single valued docvals impl *has* to have a value
 for every doc, the schema error you encountered was added if you try to
 use it on a field that isn't required or doesn't have a default value --
 to force you to be explicit about which default you want, instead of the
 low level lucene 0 default coming into play w/o you knowing about it.
 (as Shawn mentioned)

 the multivalued docvals impl could concivably be used instead for these
 types of single valued fields (ie: to support 0 or 1 values) but there is
 no sorting support for multivalued docvals, so it would cause other
 problems.

 One possible workaround for people who want to take advantage of sort
 missing first/last type sorting on a docvals type field would be to manage
 the missing information yourself in a distinct field which you also
 leverage in any filtering or sorting on the docvals field.

 ie, have a docvalues field "myfield" which is single valued, with some
 configured default value, and then have a "myfield_exists" boolean field
 which is single valued and required.  when indexing docs, if "myfield"
 does/doesn't have a value, set "myfield_exists" accordingly (this would
 be fairly trivial in an update processor) and then instead of sorting
 just on "myfield desc" you would sort on "myfield_exists (asc|desc),
 myfield desc" (where you pick the asc or desc depending on whether you want
 docs w/o values first or last).  you would likewise need to filter on
 "myfield_exists:true" anytime you did queries against the "myfield" field.


 (perhaps someone could work on a patch to inject a synthetic field like this
 automatically for fields that are docValues=true multiValued=false
 required=false w/o a defaultValue?)
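
 As an indexing-side illustration of the workaround above (the field names and
 the 0 default are the placeholders from this discussion; the URL and the
 missing-value stand-in are assumptions):

 import org.apache.solr.client.solrj.impl.HttpSolrServer;
 import org.apache.solr.common.SolrInputDocument;

 public class ExistsFlagIndexer {
   public static void main(String[] args) throws Exception {
     HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr"); // assumed URL
     SolrInputDocument doc = new SolrInputDocument();
     doc.addField("id", "doc-1");
     Long value = null; // stand-in for a possibly-missing source value
     doc.addField("myfield_exists", value != null);  // the companion flag
     doc.addField("myfield", value != null ? value : Long.valueOf(0L)); // configured default
     server.add(doc);
     server.commit();
   }
 }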


 -Hoss



Accessing SolrZkClient instance from a plug-in?

2013-03-25 Thread Timothy Potter
I have a custom ValueSourceParser that sets up a Zookeeper Watcher on some
frequently changing metadata that a custom ValueSource depends on.

Basic flow of events is - VSP watches for metadata changes, which triggers
a refresh of some expensive data that my custom ValueSource uses at query
time. Think of the data in Zookeeper as a pointer to some larger dataset
that is computed offline and then loaded into memory for use by my custom
ValueSource.

In my ValueSourceParser, I connect to Zookeeper using an instance of the
SolrZkClient class and am receiving WatchedEvents when my metadata changes
(as expected).

All this works great until core reload happens. From what I can tell,
there's no shutdown hook for ValueSourceParsers, so what's happening is
that my code ends up adding multiple Watchers and thus receives multiple
update events when the metadata changes.

What I need is either

1) a shutdown hook in my VSP that allows me to clean-up the SolrZkClient
instance my code is managing, or

2) access to the ZkController instance owned by the CoreContainer from my
VSP.

For me #2 is better as I'd prefer to just re-use Solr's instance of
SolrZkClient.

I can go and hack either of these in pretty easily but wanted to see if
someone knows a better way to get 1 or 2?

In general, it might be handy to allow plug-ins to get access to the
Zookeeper client SolrCloud is using.

Thanks.
Tim


Re: Multi-core and replicated Solr cloud testing. Data-directory mis-configures

2013-03-25 Thread Trevor Campbell
Solved.

I was able to solve this by removing any reference to dataDir from the
solrconfig.xml.  So in solr.xml for each node I have:

  <cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:}">
    <core name="jira-issue" instanceDir="jira-issue">
      <property name="dataDir" value="jira-issue/data/" />
    </core>
    <core name="jira-comment" instanceDir="jira-comment">
      <property name="dataDir" value="jira-comment/data/" />
    </core>
    <core name="jira-change-history" instanceDir="jira-change-history">
      <property name="dataDir" value="jira-change-history/data/" />
    </core>
  </cores>

and in solrconfig.xml in each core I have removed the reference to dataDir
completely:

  <!-- <dataDir>${solr.core0.data.dir:}</dataDir> -->



On Tue, Mar 26, 2013 at 8:41 AM, Trevor Campbell tcampb...@atlassian.com wrote:

 [earlier messages in this thread are quoted in full above; snipped]




Re: Accessing SolrZkClient instance from a plug-in?

2013-03-25 Thread Mark Miller
I don't know the ValueSourceParser from a hole in my head, but it looks like it
has access to the SolrCore with fp.req.getCore()?

If so, it's easy to get the zk stuff:

core.getCoreDescriptor().getCoreContainer().getZkController().getZkClient()

From memory, so perhaps with some minor misname.
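
Spelled out inside the parser, a sketch against the 4.x names recalled above
(the ValueSource subclass is hypothetical, and getZkController() returns null
when Solr is not running in cloud mode):

import org.apache.lucene.queries.function.ValueSource;
import org.apache.solr.cloud.ZkController;
import org.apache.solr.common.cloud.SolrZkClient;
import org.apache.solr.core.SolrCore;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.SyntaxError;
import org.apache.solr.search.ValueSourceParser;

public class MetadataVsp extends ValueSourceParser {
  @Override
  public ValueSource parse(FunctionQParser fp) throws SyntaxError {
    SolrCore core = fp.getReq().getCore();
    ZkController zk = core.getCoreDescriptor().getCoreContainer().getZkController();
    SolrZkClient client = (zk == null) ? null : zk.getZkClient(); // null outside cloud mode
    // ...register the metadata Watcher on Solr's own client instead of a private one...
    return new MyMetadataValueSource(client); // hypothetical ValueSource subclass
  }
}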

- Mark

On Mar 25, 2013, at 6:03 PM, Timothy Potter thelabd...@gmail.com wrote:

 [Tim's original message is quoted in full above; snipped]



Re: lucene 42 codec

2013-03-25 Thread Chris Hostetter

: I noticed that apache solr 4.2 uses the lucene codec 4.1. How can I
: switch to 4.2?

Unless you've configured something oddly, Solr is already using the 4.2 
codec.  

What you are probably seeing is that the fileformat for several types of 
files hasn't changed from the 4.1 (or even 4.0) versions, so they are 
still used in 4.2 (and confusingly include Lucene41 in the filenames in 
several cases).

Note that in the 4.2 codec package javadocs, several codec related classes 
are not implemented, and the docs link back to the 4.1 and 4.0 
implementations...

https://lucene.apache.org/core/4_2_0/core/org/apache/lucene/codecs/lucene42/package-summary.html

If you peek inside the Lucene42Codec class you'll also see...

  private final StoredFieldsFormat fieldsFormat = new 
Lucene41StoredFieldsFormat();
  private final TermVectorsFormat vectorsFormat = new 
Lucene42TermVectorsFormat();
  private final FieldInfosFormat fieldInfosFormat = new 
Lucene42FieldInfosFormat();
  private final SegmentInfoFormat infosFormat = new Lucene40SegmentInfoFormat();
  private final LiveDocsFormat liveDocsFormat = new Lucene40LiveDocsFormat();

-Hoss


Re: Accessing SolrZkClient instance from a plug-in?

2013-03-25 Thread Timothy Potter
Brilliant! Thank you - I was focusing on the init method and totally
ignored the FunctionQParser passed to the parse method.

Cheers,
Tim

On Mon, Mar 25, 2013 at 4:16 PM, Mark Miller markrmil...@gmail.com wrote:

  [Mark's reply and the original message are quoted in full above; snipped]




Any experience with adding documents batch sizes?

2013-03-25 Thread Benjamin, Roy
My application is update intensive.  The documents are pretty small, less than 
1K bytes.

Just now I'm batching 4K documents with each SolrJ addDocs() call.

Wondering what I should expect if I increase this batch size, say to 8K docs
per update?

Thanks

Roy


Solr 3.6





Re: status 400 on posting json

2013-03-25 Thread Patrice Seyed
Hi Jack, I tried putting the schema.xml file (further below) in the
path you specified below, but when I tried to start (java -jar
start.jar) I got the message below.

I can try a fresh install like you suggested, but I'm not sure what
would be different. I was using documentation at
http://lucene.apache.org/solr/4_1_0/tutorial.html using the binary
from zip. Are you suggesting building from source and/or some other
approach? Also, what is the best documentation currently for a 4.1
install (for mac)? (There are a lot of sites out there.) Thanks in
advance. -Patrice

SEVERE: Unable to create core: collection1
org.apache.solr.common.SolrException: Unknown fieldtype 'string'
specified on field id
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:390)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
at 
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:680)
Mar 25, 2013 7:14:53 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Unable to create
core: collection1
at 
org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1654)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1039)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:680)
Caused by: org.apache.solr.common.SolrException: Unknown fieldtype
'string' specified on field id
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:390)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
at 
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
... 10 more


---

Here's the normal path to the example configuration in Solr 4.1:

.../solr-4.1.0/example/solr/collection1/conf

That's the directory in which the example schema.xml and other
configuration files live.

There is no solr-4.1.0/example/conf directory, unless you managed to
create one yourself.

I suggest that you start with a fresh install of Solr 4.1

As far as keywords, the existing field is set up to be a
comma-separated list of keyword phrases. Of course, you can structure
it any way that your application requires.

-- Jack Krupansky

-Original Message- From: Patrice Seyed
Sent: Saturday, March 16, 2013 2:48 AM
To: solr-user@lucene.apache.org
Subject: Re: status 400 on posting json

Hi,

Re:

-
Is there some place I should indicate what parameters are included in
the JSON objects sent? I was able to test books.json without the
error.

Yes, in Solr's schema.xml (under the conf/ directory).  See
http://wiki.apache.org/solr/SchemaXml for more details.

   Erik Hatcher

and:

-

I tried it and I get the same error response! Which is because... I
don't have a field named datasource.

You need to check the Solr schema.xml for the available fields and
then add any fields that your JSON uses that are not already there. Be
sure to shutdown and restart Solr after editing the schema.

I did notice that there is a keywords field, but it is not
multivalued, while your keywords are multivalued.

Or, you can use dynamic fields, such as datasource_s and keywords_ss
("s" for string and a second "s" for multivalued), etc. for your other
fields.

-- Jack Krupansky

-

Thanks very much for these responses.  I'm still 

Re: Problem with DataImportHandler and embedded entities

2013-03-25 Thread Rulian Estivalletti
Did you ever resolve the issue with your full-import only importing 1
document? I'm monitoring the source db and it's only issuing one query; it never
attempts to query for the other documents at the top of the nest.
I'm running into the exact same issue with NO help out there.
Thanks in advance
Thanks in advance


Solrcloud 4.1 Collection with multiple slices only use

2013-03-25 Thread Chris R
I have two issues and I'm unsure if they are related:

Problem:  After setting up a multiple-collection Solrcloud 4.1 instance on
seven servers, when I index documents they aren't distributed across
the index slices.  It feels as though I don't actually have a cloud
implementation, yet everything I see in the admin interface and zookeeper
implies I do.  I feel as if I'm overlooking something obvious, but have not
been able to figure out what.

Configuration: Seven servers and four collections, each with 12 slices (no
replica shards yet).  Zookeeper configured in a three node ensemble.  When
I send documents to Server1/Collection1 (which holds two slices of
collection1), all the documents show up in a single index shard (core).
 Perhaps related: I have found it impossible to get Solr to recognize the
server names with anything but a literal host="servername" parameter in the
solr.xml.  Hostname parameters, host files, network, and DNS are all
configured correctly.

I have a Solr 4.0 single collection set up similarly and it works just
fine.  I'm using the same schema.xml and solrconfig.xml files on the 4.1
implementation with only the luceneMatchVersion changed to LUCENE_41.

sample solr.xml from server1

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" hostPort="8080" host="server1"
         shareSchema="true" zkClientTimeout="6">
    <core collection="col201301" shard="col201301s04"
          instanceDir="/solr/col201301/col201301s04sh01" name="col201301s04sh01"
          dataDir="/solr/col201301/col201301s04sh01/data"/>
    <core collection="col201301" shard="col201301s11"
          instanceDir="/solr/col201301/col201301s11sh01" name="col201301s11sh01"
          dataDir="/solr/col201301/col201301s11sh01/data"/>
    <core collection="col201302" shard="col201302s06"
          instanceDir="/solr/col201302/col201302s06sh01" name="col201302s06sh01"
          dataDir="/solr/col201302/col201302s06sh01/data"/>
    <core collection="col201303" shard="col201303s01"
          instanceDir="/solr/col201303/col201303s01sh01" name="col201303s01sh01"
          dataDir="/solr/col201303/col201303s01sh01/data"/>
    <core collection="col201303" shard="col201303s08"
          instanceDir="/solr/col201303/col201303s08sh01" name="col201303s08sh01"
          dataDir="/solr/col201303/col201303s08sh01/data"/>
    <core collection="col201304" shard="col201304s03"
          instanceDir="/solr/col201304/col201304s03sh01" name="col201304s03sh01"
          dataDir="/solr/col201304/col201304s03sh01/data"/>
    <core collection="col201304" shard="col201304s10"
          instanceDir="/solr/col201304/col201304s10sh01" name="col201304s10sh01"
          dataDir="/solr/col201304/col201304s10sh01/data"/>
  </cores>
</solr>

Thanks
Chris


Re: Any experience with adding documents batch sizes?

2013-03-25 Thread Otis Gospodnetic
Hi,

You'll have to test because there is no general rule that works in all
environments, but from testing this a while back, you will reach the point
of diminishing returns at some point.  You don't mention using
StreamingUpdateSolrServer, so you may want to try that instead:
http://lucene.apache.org/solr/api-3_6_1/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html
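
A hedged Solr 3.6 sketch (the URL, queue size, and thread count are
placeholders to tune, not recommendations):

import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class StreamingIndexer {
  public static void main(String[] args) throws Exception {
    // Buffers adds in a queue and sends them from background threads,
    // so the caller no longer hand-tunes a batch size.
    StreamingUpdateSolrServer server =
        new StreamingUpdateSolrServer("http://localhost:8983/solr", 4096, 4); // assumed URL
    for (int i = 0; i < 8000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);
      server.add(doc);
    }
    server.blockUntilFinished(); // drain the queue before committing
    server.commit();
  }
}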

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Mar 25, 2013 at 7:06 PM, Benjamin, Roy rbenja...@ebay.com wrote:

 My application is update intensive.  The documents are pretty small, less
 than 1K bytes.

 Just now I'm batching 4K documents with each SolrJ addDocs() call.

 Wondering what I should expect with increasing this batch size?  Say 8K
 docs per update?

 Thanks

 Roy


 Solr 3.6






Re: OutOfMemoryError

2013-03-25 Thread Otis Gospodnetic
Arkadi,

jstat -gcutil -h20 <pid> 2000 100 also gives useful info about GC and I use
it a lot for quick insight into what is going on with GC.  SPM (see
http://sematext.com/spm/index.html ) may also be worth using.

Otis
--
Solr & ElasticSearch Support
http://sematext.com/






On Mon, Mar 25, 2013 at 11:01 AM, Arkadi Colson ark...@smartbit.be wrote:

  [the earlier messages in this thread, including the kernel oom-killer
  trace, are quoted in full above; snipped]

Re: status 400 on posting json

2013-03-25 Thread Jack Krupansky
Your schema has only fields, but no field types. Check the Solr example
schema for reference, and include all of the types defined there unless you
know that you do not need them. "string" is clearly one that is needed.


-- Jack Krupansky

-Original Message- 
From: Patrice Seyed

Sent: Monday, March 25, 2013 7:19 PM
To: solr-user@lucene.apache.org
Subject: Re: status 400 on posting json

Hi Jack, I tried putting the schema.xml file (further below) in the
path you specified below, but when I tried to start (java -jar
start.jar) I got the message below.

I can try a fresh install like you suggested, but I'm not sure what
would be different. I was using the documentation at
http://lucene.apache.org/solr/4_1_0/tutorial.html with the binary
from the zip. Are you suggesting building from source and/or some other
approach? Also, what is the best documentation currently for a 4.1
install (on a Mac)? There are a lot of sites out there. Thanks in
advance. -Patrice

SEVERE: Unable to create core: collection1
org.apache.solr.common.SolrException: Unknown fieldtype 'string'
specified on field id
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:390)
at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
at 
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)

at java.lang.Thread.run(Thread.java:680)
Mar 25, 2013 7:14:53 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Unable to create
core: collection1
at 
org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1654)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1039)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)

at java.lang.Thread.run(Thread.java:680)
Caused by: org.apache.solr.common.SolrException: Unknown fieldtype
'string' specified on field id
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:390)
at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
at 
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
... 10 more


---

Here's the normal path to the example configuration in Solr 4.1:

.../solr-4.1.0/example/solr/collection1/conf

That's the directory in which the example schema.xml and other
configuration files live.

There is no solr-4.1.0/example/conf directory, unless you managed to
create one yourself.

I suggest that you start with a fresh install of Solr 4.1.

As far as keywords go, the existing field is set up to be a
comma-separated list of keyword phrases. Of course, you can structure
it any way your application requires.

-- Jack Krupansky

-Original Message- From: Patrice Seyed
Sent: Saturday, March 16, 2013 2:48 AM
To: solr-user@lucene.apache.org
Subject: Re: status 400 on posting json

Hi,

Re:

-
Is there some place I should indicate what parameters are included in
the JSON objects sent? I was able to test books.json without the
error.

Yes, in Solr's schema.xml (under the conf/ directory).  See
http://wiki.apache.org/solr/SchemaXml for more details.

  Erik Hatcher

and:

-

I tried it and I get the same error response! Which is because... I
don't have a field named datasource.

You need to check the Solr schema.xml for the available fields and
then add any fields that your JSON uses that are not already there. Be
sure to shut down and restart Solr after editing the schema.

I did notice that there is a keywords field, but it is not
multivalued, while your keywords are multivalued.
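
For illustration only, the missing fields could be declared along these lines 
(the field names come from the JSON being posted; the attribute choices are 
assumptions to adapt):

<field name="datasource" type="string" indexed="true" stored="true"/>
<field name="keywords" type="string" indexed="true" stored="true" multiValued="true"/>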

Or, you can use dynamic fields, such as datasource_s and keywords_ss
(_s for a string field, _ss for a multi-valued string field), as sketched below.
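
The stock example schema defines those roughly as follows (a sketch; verify 
in your own schema.xml):

<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
<dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true"/>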

Re: Using Solr For a Real Search Engine

2013-03-25 Thread Otis Gospodnetic
Hi,

This question is too open-ended for anyone to give you a good answer.
 Maybe you want to ask more specific questions?  As for embedding vs. war,
start with the simpler war deployment and think about the alternatives if that
doesn't work for you.

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Fri, Mar 22, 2013 at 8:07 AM, Furkan KAMACI furkankam...@gmail.com wrote:

 If I want to use Solr in a web search engine, what kind of strategy should
 I follow for running Solr? I mean, should I run it via embedded Jetty or
 deploy the war to a container? You should consider that I will have a
 heavy workload on my Solr.



Re: Solrcloud 4.1 Collection with multiple slices only use

2013-03-25 Thread Mark Miller
I'm guessing you didn't specify numShards. Things changed in 4.1 - if you don't 
specify numShards it goes into a mode where it's up to you to distribute 
updates.
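
For example, with the stock Jetty example the shard count can be passed as a 
system property on first startup, roughly like this (a sketch only; the 
ZooKeeper addresses, config name, and shard count are illustrative):

java -DzkHost=zk1:2181,zk2:2181,zk3:2181 -DnumShards=12 \
     -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf \
     -jar start.jar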

- Mark

On Mar 25, 2013, at 10:29 PM, Chris R corg...@gmail.com wrote:

 I have two issues and I'm unsure if they are related:
 
 Problem:  After setting up a multiple collection Solrcloud 4.1 instance on
 seven servers, when I index the documents they aren't distributed across
 the index slices.  It feels as though I don't actually have a cloud
 implementation, yet everything I see in the admin interface and zookeeper
 implies I do.  I feel as if I'm overlooking something obvious, but have not
 been able to figure out what.
 
 Configuration: Seven servers and four collections, each with 12 slices (no
 replica shards yet).  Zookeeper configured in a three node ensemble.  When
 I send documents to Server1/Collection1 (which holds two slices of
 collection1), all the documents show up in a single index shard (core).
 Perhaps related, I have found it impossible to get Solr to recognize the
 server names with anything but a literal host="servername" parameter in the
 solr.xml.  Hostname parameters, host files, network, and DNS are all
 configured correctly.
 
 I have a Solr 4.0 single collection set up similarly and it works just
 fine.  I'm using the same schema.xml and solrconfig.xml files on the 4.1
 implementation with only the luceneMatchVersion changed to LUCENE_41.
 
 sample solr.xml from server1
 
 <?xml version="1.0" encoding="UTF-8" ?>
 <solr persistent="true">
 <cores adminPath="/admin/cores" hostPort="8080" host="server1"
 shareSchema="true" zkClientTimeout="6">
 <core collection="col201301" shard="col201301s04"
 instanceDir="/solr/col201301/col201301s04sh01" name="col201301s04sh01"
 dataDir="/solr/col201301/col201301s04sh01/data"/>
 <core collection="col201301" shard="col201301s11"
 instanceDir="/solr/col201301/col201301s11sh01" name="col201301s11sh01"
 dataDir="/solr/col201301/col201301s11sh01/data"/>
 <core collection="col201302" shard="col201302s06"
 instanceDir="/solr/col201302/col201302s06sh01" name="col201302s06sh01"
 dataDir="/solr/col201302/col201302s06sh01/data"/>
 <core collection="col201303" shard="col201303s01"
 instanceDir="/solr/col201303/col201303s01sh01" name="col201303s01sh01"
 dataDir="/solr/col201303/col201303s01sh01/data"/>
 <core collection="col201303" shard="col201303s08"
 instanceDir="/solr/col201303/col201303s08sh01" name="col201303s08sh01"
 dataDir="/solr/col201303/col201303s08sh01/data"/>
 <core collection="col201304" shard="col201304s03"
 instanceDir="/solr/col201304/col201304s03sh01" name="col201304s03sh01"
 dataDir="/solr/col201304/col201304s03sh01/data"/>
 <core collection="col201304" shard="col201304s10"
 instanceDir="/solr/col201304/col201304s10sh01" name="col201304s10sh01"
 dataDir="/solr/col201304/col201304s10sh01/data"/>
 </cores>
 </solr>
 
 Thanks
 Chris



Re: opinion: Stats over the faceting component

2013-03-25 Thread Otis Gospodnetic
Nope, this doesn't find it:
http://search-lucene.com/?q=facet+stats&fc_project=Solr&fc_type=issue

Maybe Anirudha wants to do that?

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, Mar 21, 2013 at 5:16 AM, Upayavira u...@odoko.co.uk wrote:

 Have you made a JIRA ticket for this? This is useful generally, isn't
 it?

 Thx, Upayavira

 On Thu, Mar 21, 2013, at 03:18 AM, Tirthankar Chatterjee wrote:
  We have done something similar.
  Please read
 
 http://lucene.472066.n3.nabble.com/How-to-modify-Solr-StatsComponent-to-support-stats-query-td4028991.html
 
  https://plus.google.com/101157854606139706613/posts/HmYYit3RABM
 
  If this is something you wanted.
 
  On Mar 20, 2013, at 7:08 PM, Anirudha Jadhav wrote:
 
  I want to get an opinion here: instead of having statistics as an
  independent component, which is always limited by faceting features (e.g. it
  does not support date ranges or custom ranges, pivots, etc.),
 
  why not have a parameter to the facet component to compute and return stats?
 
  eg. facet.stats=true,facet.stats.stat=min,max,(sum(sqrt(x),log(y),z,0.5))
 
  let me know your thoughts,
 
  --
  Anirudha P. Jadhav
 
 
 



Re: Solr index Backup and restore of large indexes

2013-03-25 Thread Otis Gospodnetic
Hi,

Try something like this: http://host/solr/replication?command=backup

See: http://wiki.apache.org/solr/SolrReplication
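
For example, a backup of a single core can be triggered like this (host and 
core name are illustrative; the optional numberToKeep parameter for pruning 
old snapshots is described on the wiki page above):

http://localhost:8983/solr/collection1/replication?command=backup&numberToKeep=3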

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, Mar 21, 2013 at 3:23 AM, Sandeep Kumar Anumalla
sanuma...@etisalat.ae wrote:

 Hi,

 We are loading approximately 1TB of index data daily. Please let me know the best 
 procedure to back up and restore the indexes. I am using Solr 4.2.



 Thanks & Regards
 Sandeep A
 Ext : 02618-2856
 M : 0502493820


 


Re: Shingles Filter Query time behaviour

2013-03-25 Thread Otis Gospodnetic
Hi,

What does your query look like?  Does it look like q=name:dark knight?
 If so, note that only "dark" is going against the name field.  Try
q=name:dark name:knight or q=name:"dark knight".

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Mar 18, 2013 at 6:21 PM, Catala, Francois
francois.cat...@nuance.com wrote:
 Hello,

 I am trying to have the input "darkknight" match documents containing either 
 "dark knight" or "darkknight".
 The reverse should also work ("dark knight" matching "dark knight" and 
 "darkknight") but it doesn't. Does anyone know why?


 When I run the following query I get the expected response with the two 
 documents matched

 <lst name="responseHeader">
   <int name="status">0</int>
   <int name="QTime">1</int>
   <lst name="params">
     <str name="fl">name</str>
     <str name="indent">true</str>
     <str name="q">name:darkknight</str>
     <str name="wt">xml</str>
   </lst>
 </lst>
 <result name="response" numFound="2" start="0">
   <doc>
     <str name="name">Batman, the darkknight Rises</str></doc>
   <doc>
     <str name="name">Batman, the dark knight Rises</str></doc>
 </result>
 </response>


 HOWEVER, when I run the same query looking for "dark knight" (two words) I get 
 only 1 document matched, as the response shows:

 <lst name="responseHeader">
   <int name="status">0</int>
   <int name="QTime">0</int>
   <lst name="params">
     <str name="fl">name</str>
     <str name="indent">true</str>
     <str name="q">name:dark knight</str>
     <str name="wt">xml</str>
   </lst>
 </lst>
 <result name="response" numFound="1" start="0">
   <doc>
     <str name="name">Batman, the dark knight Rises</str></doc>
 </result>
 </response>

 I have these documents as input:

 <doc>
   <field name="id">bat1</field>
   <field name="name">Batman, the dark knight Rises</field>
 </doc>
 <doc>
   <field name="id">bat2</field>
   <field name="name">Batman, the darkknight Rises</field>
 </doc>

 And I defined this analyser:

   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.ShingleFilterFactory"
             tokenSeparator=""
             outputUnigrams="true"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.ShingleFilterFactory"
             tokenSeparator=""
             outputUnigrams="true"
             outputUnigramIfNoNgrams="true"/>
   </analyzer>
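
For what it's worth, here is roughly what that chain produces for the 
two-word title (hand-derived from the config above, so worth verifying on the 
admin Analysis screen). At index time, "Batman, the dark knight Rises" is 
whitespace-tokenized, lowercased, and shingled with an empty separator:

  unigrams: batman,  the  dark  knight  rises
  shingles: batman,the  thedark  darkknight  knightrises

So the two-word document is indexed with the single token darkknight as well, 
which is why q=name:darkknight matches both documents. At query time, though, 
the query parser splits q=name:dark knight on whitespace before the analyzer 
runs, so the shingle filter sees dark and knight separately and never forms 
darkknight; only the document that literally contains the tokens dark and 
knight can match both words.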


Re: Shingles Filter Query time behaviour

2013-03-25 Thread Jack Krupansky

Or, q=name:(dark knight).

-- Jack Krupansky

-Original Message- 
From: Otis Gospodnetic

Sent: Monday, March 25, 2013 11:51 PM
To: solr-user@lucene.apache.org
Subject: Re: Shingles Filter Query time behaviour

Hi,

What does your query look like?  Does it look like q=name:dark knight?
If so, note that only "dark" is going against the name field.  Try
q=name:dark name:knight or q=name:"dark knight".

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Mar 18, 2013 at 6:21 PM, Catala, Francois
francois.cat...@nuance.com wrote:

Hello,

I am trying to have the input "darkknight" match documents containing 
either "dark knight" or "darkknight".
The reverse should also work ("dark knight" matching "dark knight" and 
"darkknight") but it doesn't. Does anyone know why?



When I run the following query I get the expected response with the two 
documents matched


<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">1</int>
  <lst name="params">
    <str name="fl">name</str>
    <str name="indent">true</str>
    <str name="q">name:darkknight</str>
    <str name="wt">xml</str>
  </lst>
</lst>
<result name="response" numFound="2" start="0">
  <doc>
    <str name="name">Batman, the darkknight Rises</str></doc>
  <doc>
    <str name="name">Batman, the dark knight Rises</str></doc>
</result>
</response>


HOWEVER, when I run the same query looking for "dark knight" (two words) I 
get only 1 document matched, as the response shows:


<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">0</int>
  <lst name="params">
    <str name="fl">name</str>
    <str name="indent">true</str>
    <str name="q">name:dark knight</str>
    <str name="wt">xml</str>
  </lst>
</lst>
<result name="response" numFound="1" start="0">
  <doc>
    <str name="name">Batman, the dark knight Rises</str></doc>
</result>
</response>

I have these documents as input:

<doc>
  <field name="id">bat1</field>
  <field name="name">Batman, the dark knight Rises</field>
</doc>
<doc>
  <field name="id">bat2</field>
  <field name="name">Batman, the darkknight Rises</field>
</doc>

And I defined this analyser:

  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory"
            tokenSeparator=""
            outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory"
            tokenSeparator=""
            outputUnigrams="true"
            outputUnigramIfNoNgrams="true"/>
  </analyzer>




Re: Query slow with termVectors termPositions termOffsets

2013-03-25 Thread Ravi Solr
Yes, the index size increased after turning on termPositions and termOffsets.

Ravi Kiran Bhaskar

On Mon, Mar 25, 2013 at 1:13 PM, alx...@aim.com wrote:

 Did index size increase after turning on termPositions and termOffsets?

 Thanks.
 Alex.







 -Original Message-
 From: Ravi Solr ravis...@gmail.com
 To: solr-user solr-user@lucene.apache.org
 Sent: Mon, Mar 25, 2013 8:27 am
 Subject: Query slow with termVectors termPositions termOffsets


 Hello,
 We re-indexed our entire core of 115 docs with some of the
 fields having termVectors="true" termPositions="true" termOffsets="true";
 prior to the reindex we only had termVectors="true". After the reindex
 the query component has become very slow. I thought that adding
 termOffsets and termPositions would increase the speed; am I wrong? Several
 queries like the one shown below, which used to run fine, are now very slow.
 Can somebody kindly clarify how termOffsets and termPositions affect the
 query component?

 <lst name="process"><double name="time">19076.0</double>
  <lst name="org.apache.solr.handler.component.QueryComponent"><double
 name="time">18972.0</double></lst>
 <lst name="org.apache.solr.handler.component.FacetComponent"><double
 name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double
 name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.HighlightComponent"><double
 name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.StatsComponent"><double
 name="time">0.0</double></lst>
 <lst
 name="org.apache.solr.handler.component.QueryElevationComponent"><double
 name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.clustering.ClusteringComponent"><double
 name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.DebugComponent"><double
 name="time">104.0</double></lst>
 </lst>



 [#|2013-03-25T11:22:53.446-0400|INFO|sun-appserver2.1|org.apache.solr.core.SolrCore|_ThreadID=45;_ThreadName=httpSSLWorkerThread-9001-19;|[xxx]
 webapp=/solr-admin path=/select

 params={q=primarysectionnode:(/national*+OR+/health*)+OR+(contenttype:Blog+AND+subheadline:(The+Checkup+OR+Checkpoint+Washington+OR+Post+Carbon+OR+TSA+OR+College+Inc.+OR+Campus+Overload+OR+Planet+Panel+OR+The+Answer+Sheet+OR+Class+Struggle+OR+BlogPost))+OR+(contenttype:Photo+Gallery+AND+headline:day+in+photos)start=0rows=1sort=displaydatetime+descfq=-source:(Reuters+OR+PC+World+OR+CBS+News+OR+NC8/WJLA+OR+NewsChannel+8+OR+NC8+OR+WJLA+OR+CBS)+-contenttype:(Discussion+OR+Photo)+-slug:(op-*dummy*+OR+noipad-*)+-(contenttype:Photo+Gallery+AND+headline:(Drawing+Board+OR+Drawing+board+OR+drawing+board))+headline:[*+TO+*]+contenttype:[*+TO+*]+pubdatetime:[NOW/DAY-3YEARS+TO+NOW/DAY%2B1DAY]+-headline:(Summary+Box*+OR+Video*+OR+Post+Sports+Live*)+-slug:(warren*+OR+history)+-(contenttype:Blog+AND+subheadline:(DC+Schools+Insider+OR+On+Leadership))+contenttype:Blog+-systemid:(999c7102-955a-11e2-95ca-dd43e7ffee9c+OR+72bbb724-9554-11e2-95ca-dd43e7ffee9c+OR+2d008b80-9520-11e2-95ca-dd43e7ffee9c+OR+d2443d3c-9514-11e2-95ca-dd43e7ffee9c+OR+173764d6-9520-11e2-95ca-dd43e7ffee9c+OR+0181fd42-953c-11e2-95ca-dd43e7ffee9c+OR+e6cacb96-9559-11e2-95ca-dd43e7ffee9c+OR+03288052-9501-11e2-95ca-dd43e7ffee9c+OR+ddbf020c-9517-11e2-95ca-dd43e7ffee9c)+fullbody:[*+TO+*]wt=javabinversion=2}
 hits=4985 status=0 QTime=19044 |#]

 Thanks,

 Ravi Kiran Bhaskar





Re: Solrcloud 4.1 Collection with multiple slices only use

2013-03-25 Thread Chris R
Interesting, I saw some comments about numShards, but it wasn't ever
specific enough to catch my attention. I will give it a try tomorrow.
Thanks.
On Mar 25, 2013 11:35 PM, Mark Miller markrmil...@gmail.com wrote:

 I'm guessing you didn't specify numShards. Things changed in 4.1 - if you
 don't specify numShards it goes into a mode where it's up to you to
 distribute updates.

 - Mark

 On Mar 25, 2013, at 10:29 PM, Chris R corg...@gmail.com wrote:

  I have two issues and I'm unsure if they are related:
 
  Problem:  After setting up a multiple collection Solrcloud 4.1 instance
 on
  seven servers, when I index the documents they aren't distributed across
  the index slices.  It feels as though I don't actually have a cloud
  implementation, yet everything I see in the admin interface and zookeeper
  implies I do.  I feel as if I'm overlooking something obvious, but have not
  been able to figure out what.
 
  Configuration: Seven servers and four collections, each with 12 slices
 (no
  replica shards yet).  Zookeeper configured in a three node ensemble.
  When
  I send documents to Server1/Collection1 (which holds two slices of
  collection1), all the documents show up in a single index shard (core).
  Perhaps related, I have found it impossible to get Solr to recognize the
  server names with anything but a literal host="servername" parameter in
  the solr.xml.  Hostname parameters, host files, network, and DNS are all
  configured correctly.
 
  I have a Solr 4.0 single collection set up similarly and it works just
  fine.  I'm using the same schema.xml and solrconfig.xml files on the 4.1
  implementation with only the luceneMatchVersion changed to LUCENE_41.
 
  sample solr.xml from server1
 
  <?xml version="1.0" encoding="UTF-8" ?>
  <solr persistent="true">
  <cores adminPath="/admin/cores" hostPort="8080" host="server1"
  shareSchema="true" zkClientTimeout="6">
  <core collection="col201301" shard="col201301s04"
  instanceDir="/solr/col201301/col201301s04sh01" name="col201301s04sh01"
  dataDir="/solr/col201301/col201301s04sh01/data"/>
  <core collection="col201301" shard="col201301s11"
  instanceDir="/solr/col201301/col201301s11sh01" name="col201301s11sh01"
  dataDir="/solr/col201301/col201301s11sh01/data"/>
  <core collection="col201302" shard="col201302s06"
  instanceDir="/solr/col201302/col201302s06sh01" name="col201302s06sh01"
  dataDir="/solr/col201302/col201302s06sh01/data"/>
  <core collection="col201303" shard="col201303s01"
  instanceDir="/solr/col201303/col201303s01sh01" name="col201303s01sh01"
  dataDir="/solr/col201303/col201303s01sh01/data"/>
  <core collection="col201303" shard="col201303s08"
  instanceDir="/solr/col201303/col201303s08sh01" name="col201303s08sh01"
  dataDir="/solr/col201303/col201303s08sh01/data"/>
  <core collection="col201304" shard="col201304s03"
  instanceDir="/solr/col201304/col201304s03sh01" name="col201304s03sh01"
  dataDir="/solr/col201304/col201304s03sh01/data"/>
  <core collection="col201304" shard="col201304s10"
  instanceDir="/solr/col201304/col201304s10sh01" name="col201304s10sh01"
  dataDir="/solr/col201304/col201304s10sh01/data"/>
  </cores>
  </solr>
 
  Thanks
  Chris




Re: Scaling Solr on VMWare

2013-03-25 Thread Otis Gospodnetic
Hi Frank,

If your servlet container had a crazy low setting for the max number
of threads, I think you would see the CPU underutilized.  But I think
you would also see errors on the client about connections being
refused or queued.  Sounds like possibly a VM issue that's not
Solr-specific...
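
If you want to rule it out: in the stock Solr 4.x Jetty example the pool is 
set in etc/jetty.xml roughly like this (a sketch; the exact class and default 
values vary by Jetty version):

<Set name="ThreadPool">
  <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
    <Set name="minThreads">10</Set>
    <Set name="maxThreads">10000</Set>
  </New>
</Set>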

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Mar 25, 2013 at 1:18 PM, Frank Wennerdahl
frank.wennerd...@arcadelia.com wrote:
 Hi.



 We are currently benchmarking our Solr setup and are having trouble with
 scaling hardware for a single Solr instance. We want to investigate how one
 instance scales with hardware to find the optimal ratio of hardware vs
 sharding when scaling. Our main problem is that we cannot identify any
 hardware limitations, CPU is far from maxed out, disk I/O is not an issue as
 far as we can see and there is plenty of RAM available.



 In short we have a couple of questions that we hope someone here could help
 us with. Detailed information about our setup, use case and things we've
 tried is provided below the questions.



 Questions:

 1.   What could cause Solr to utilize only 2 CPU cores when sending
 multiple update requests in parallel in a VMWare environment?

 2.   Is there a software limit on the number of CPU cores that Solr can
 utilize while indexing?

 3.   Ruling out network and disk performance, what could cause a
 decrease in indexing speed when sending data over a network as opposed to
 sending it from the local machine?



 We are running on three cores per Solr instance, however only one core
 receives any non-trivial load. We are using VMWare (ESX 5.0) virtual
 machines for hosting Solr and a QNAP NAS containing 12 HDDs in a RAID5 setup
 for storage. Our data consists of a huge amount of small-sized documents.
 When indexing we are using Solr's javabin format (although not through
 Solrj, we have implemented the format in C#/.NET) and our batch size is
 currently 1000 documents. The actual size of the data varies, but the
 batches we have used range from approximately 450KB to 1050KB. We're sending
 these batches to Solr in parallel using a number of send threads.



 There are two issues that we've run into:

 1.   When sending data from one VM to Solr on another VM we observed
 that Solr did not seem to utilize CPU cores properly. The Solr VM had 8
 vCPUs available and we were using 4 threads sending data in parallel. We saw
 a low (~29%)  CPU utilization on the Solr VM with 2 cores doing almost all
 the work while the remaining cores remained almost idle. Increasing the
 number of send threads to 8 yielded the same result, capping our indexing
 speed to about 4.88MB per second. The client VM had 4 vCPUs which were
 hardly utilized as we were reading data from pre-generated files.

 To rule out network limitations we sent the test data to a server on the
 Solr VM that simply accepted the request and returned an empty response. We
 were able to send data at 219MB per second, so the network did not seem to
 be the bottleneck. We also tested sending data to Solr locally from the Solr
 VM to see if disk I/O was the problem. Surprisingly we were able to index
 significantly faster at 7.34MB per second using 4 send threads (8.4MB with 6
 send threads) which indicated that the disk was not slowing us down when
 sending data over the network. Worth noting is that the CPU utilization was
 now higher (47.81% with 4 threads, 58.8% with 6) and the work was spread out
 over all cores. As before we used pre-generated files and the process
 sending the data used almost no CPU.

 2.   We decided to investigate how Solr would scale with additional
 vCPUs when indexing locally. We increased the number of vCPUs to 16 and the
 number of send threads to 8. Sadly we now experienced a decrease in
 performance: 7MB/s with 8 threads, 6.4MB/s with 12 threads and 4.95/s with
 16 threads. The CPU usage was in average 30%, regardless of the number of
 threads used. We know that additional vCPUs can cause decreased performance
 in VMWare virtual machines due to time waiting for CPUs to become available.
 We investigated this using esxtop, which only showed a 1% CSTP. According to
 VMWare
 (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005362),
 a CSTP above 3% could indicate that multiple vCPUs are causing performance
 issues.

 We noticed that the average disk write speed seemed to cap at around 11.5
 million bytes per second so we tested the same VM setup using a faster disk.
 This did not yield any increase in performance (it was actually somewhat
 slower), neither did using a RAM-mapped drive for Solr.



 Any help or ideas of what could be the bottleneck in our setup would be
 greatly appreciated!



 Best regards,

 Frank Wennerdahl

 Developer

 Arcadelia AB



RE: Slow queries for common terms

2013-03-25 Thread David Parks
"book" by itself returns in 4s (non-optimized disk I/O); running it a second
time returned 0s, so I think I can presume that the query was not cached the
first time. This system has been up for a week, so it's warm.

I'm going to give your article a good long read, thanks for that.   

I guess good fast disks/SSDs and sharding should also improve on the base 4
sec query time. How _does_ Google get their query times down to 0.35s
anyway? I presume their indexes are larger than my 150G index. :)

I still am a bit worried about what will happen when my index is 500GB
(it'll happen soon enough), even with sharding... well... I'd just need a
lot of servers it seems, and my feeling is that if I need a lot of
servers for a few users, how will it scale to many users?

Thanks for the great discussion,
Dave


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, March 25, 2013 10:04 PM
To: solr-user@lucene.apache.org
Subject: Re: Slow queries for common terms

take a look here:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

looking at memory consumption can be a bit tricky to interpret with
MMapDirectory.

But you say "I see the CPU working very hard", which implies that your issue
is just scoring 90M documents. A way to test: try q=*:*&fq=field:book. My
bet is that that will be much faster, in which case scoring is your
choke-point and you'll need to spread that load across more servers, i.e.
shard.

When running the above, make sure of a couple of things:
1) you haven't run the fq query before (or you have filterCache turned
completely off).
2) you _have_ run a query or two that warms up your low-level caches.
Doesn't matter what, just as long as it doesn't have an fq clause.
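
For example (host, core, and field names are illustrative; the first request
satisfies 2) without touching the filterCache, the second isolates filtering
from scoring):

http://localhost:8983/solr/collection1/select?q=book&rows=10
http://localhost:8983/solr/collection1/select?q=*:*&fq=field:book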

Best
Erick



On Sat, Mar 23, 2013 at 3:10 AM, David Parks davidpark...@yahoo.com wrote:

 I see the CPU working very hard, and at the same time I see 2 MB/sec 
 disk access for that 15 seconds. I am not running it this instant, but 
 it seems to me that there were more CPU cycles available, so unless 
 it's an issue of not being able to multithread it any further I'd say
it's more IO related.

 I'm going to set up solr cloud and shard across the 2 servers I have 
 available for now. It's not an optimal setup we have while we're in a 
 private beta period, but maybe it'll improve things (I've got 2 
 servers with 2x 4TB disks in raid-0 shared with the webservers).

 I'll work towards some improved IO performance and maybe more shards 
 and see how things go. I'll also be able to up the RAM in just a 
 couple of weeks.

 Are there any settings I should think of in terms of improving cache 
 performance when I can give it say 10GB of RAM?

 Thanks, this has been tremendously helpful.

 David


 -Original Message-
 From: Tom Burton-West [mailto:tburt...@umich.edu]
 Sent: Saturday, March 23, 2013 1:38 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Slow queries for common terms

 Hi David and Jan,

 I wrote the blog post, and David, you are right, the problem we had 
 was with phrase queries because our positions lists are so huge.  
 Boolean
 queries don't need to read the positions lists.   I think you need to
 determine whether you are CPU bound or I/O bound.It is possible that
 you are I/O bound and reading the term frequency postings for 90 
 million docs is taking a long time.  In that case, More memory in the 
 machine (but not dedicated to Solr) might help because Solr relies on 
 OS disk caching for caching the postings lists.  You would still need 
 to do some cache warming with your most common terms.
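
One common way to do that warming is a firstSearcher/newSearcher listener in
solrconfig.xml, sketched here with illustrative query terms:

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">book</str><str name="rows">10</str></lst>
  </arr>
</listener>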

 On the other hand as Jan pointed out, you may be cpu bound because 
 Solr doesn't have early termination and has to rank all 90 million 
 docs in order to show the top 10 or 25.

 Did you try the OR search to see if your CPU is at 100%?

 Tom

 On Fri, Mar 22, 2013 at 10:14 AM, Jan Høydahl jan@cominvent.com
 wrote:

  Hi
 
  There might not be a final cure with more RAM if you are CPU bound.
  Scoring 90M docs is some work. Can you check what's going on during 
  those
  15 seconds? Is your CPU at 100%? Try a (foo OR bar OR baz) search 
  which generates 100 million hits and see if that is slow too, even if 
  you don't use frequent words.
 
  I'm sure you can find other frequent terms in your corpus which 
  display similar behaviour, words which are even more frequent than 
  "book". Are you using AND as the default operator? You will benefit 
  from limiting the number of results as much as possible.
 
  The real solution is to shard across N number of servers, until you 
  reach the desired performance for the desired indexing/querying load.
 
  --
  Jan Høydahl, search solution architect Cominvent AS - 
  www.cominvent.com Solr Training - www.solrtraining.com