Re: no error delta fail with DataImportHandler

2009-12-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
the deltaQuery selects 'product_id' but your deltaImportQuery uses
${dataimporter.delta.id}
I guess it should have been ${dataimporter.delta.product_id}
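
For illustration, a sketch of the corrected attribute, assuming product_id is the column returned by the deltaQuery in the configuration below:

  deltaImportQuery="select dp.product_id, dp.display_name, dp.long_description, gp.orientation
                    from dcs_product dp, gl_product gp
                    where dp.product_id = gp.product_id
                    and dp.product_id = '${dataimporter.delta.product_id}'"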

On Wed, Dec 2, 2009 at 11:52 PM, Thomas Woodard gtfo...@hotmail.com wrote:

 I'm trying to get delta indexing set up. My configuration allows a full index 
 no problem, but when I create a test delta of a single record, the delta 
 import finds the record but then does nothing. I can only assume I have 
 something subtly wrong with my configuration, but according to the wiki, my 
 configuration should be valid. What I am trying to do is have a single delta 
 detected on the top level entity trigger a rebuild of everything under that 
 entity, the same as the first example in the wiki. Any help would be greatly 
 appreciated.

 <dataConfig>
    <dataSource name="prodcat" driver="oracle.jdbc.driver.OracleDriver"
        url="jdbc:oracle:oci:@XXX"
        user="XXX" password="XXX" autoCommit="false"
        transactionIsolation="TRANSACTION_READ_COMMITTED"/>

    <document>
        <entity name="product" dataSource="prodcat" query="
        select dp.product_id, dp.display_name, dp.long_description, gp.orientation
        from dcs_product dp, gl_product gp
        where dp.product_id = gp.product_id"
        transformer="ClobTransformer,HTMLStripTransformer"
        deltaImportQuery="select dp.product_id, dp.display_name, dp.long_description, gp.orientation
        from dcs_product dp, gl_product gp
        where dp.product_id = gp.product_id
        AND dp.product_id = '${dataimporter.delta.id}'"
        deltaQuery="select product_id from gl_product_modified where
        last_modified > TO_DATE('${dataimporter.last_index_time}', '-mm-dd hh:mi:ss')"
        rootEntity="false"
        pk="PRODUCT_ID">
            <!-- COLUMN NAMES ARE CASE SENSITIVE. THEY NEED TO BE ALL CAPS OR EVERYTHING FAILS -->
            <field column="PRODUCT_ID" name="product_id"/>
            <field column="DISPLAY_NAME" name="name"/>
            <field column="LONG_DESCRIPTION" name="long_description" clob="true" stripHTML="true" />
            <field column="ORIENTATION" name="orientation"/>

            <entity name="sku" dataSource="prodcat" query="select ds.sku_id,
            ds.sku_type, ds.on_sale, '${product.PRODUCT_ID}' || '_' || ds.sku_id as unique_id
            from dcs_prd_chldsku dpc, dcs_sku ds
            where dpc.product_id = '${product.PRODUCT_ID}'
            and dpc.sku_id = ds.sku_id"
            rootEntity="true" pk="PRODUCT_ID, SKU_ID">
                <field column="SKU_ID" name="sku_id"/>
                <field column="SKU_TYPE" name="sku_type"/>
                <field column="ON_SALE" name="on_sale"/>
                <field column="UNIQUE_ID" name="unique_id"/>

                <entity name="catalog" dataSource="prodcat" query="select pc.catalog_id
                            from gl_prd_catalog pc, gl_sku_catalog sc
                            where pc.product_id = '${product.PRODUCT_ID}' and
                            sc.sku_id = '${sku.SKU_ID}' and pc.catalog_id = sc.catalog_id"
                            pk="SKU_ID, CATALOG_ID">
                        <field column="CATALOG_ID" name="catalogs"/>
                </entity>

                <entity name="price" dataSource="prodcat" query="select ds.list_price as price
                            from dcs_sku ds
                            where ds.sku_id = '${sku.SKU_ID}'
                            and ds.on_sale = 0
                            UNION
                            select ds.sale_price as price
                            from dcs_sku ds
                            where ds.sku_id = '${sku.SKU_ID}'
                            and ds.on_sale = 1"
                            pk="SKU_ID">
                        <field column="PRICE" name="price"/>
                </entity>
            </entity>

            <entity name="studio" dataSource="prodcat" query="select gs.name
                from gl_product_studio gps, gl_studio gs where gps.studio_id = gs.studio_id
                and gps.product_id = '${product.PRODUCT_ID}'" rootEntity="false" pk="PRODUCT_ID">
                <field column="NAME" name="studio"/>
            </entity>

            <entity name="star" dataSource="prodcat" query="select gc.name
                from gl_contributor gc, gl_product_contributor gpc
                where gc.contributor_id = gpc.contributor_id and
                gpc.product_id = '${product.PRODUCT_ID}'" rootEntity="false" pk="PRODUCT_ID, CONTRIBUTOR_ID">
                <field column="NAME" name="stars"/>
            </entity>

            <entity name="director" dataSource="prodcat" query="select gc.name
                from gl_contributor gc, gl_product_director gpd
                where gc.contributor_id = gpd.contributor_id and
                gpd.product_id = '${product.PRODUCT_ID}'" rootEntity="false" pk="PRODUCT_ID, CONTRIBUTOR_ID">
                <field column="NAME" name="directors"/>
            </entity>

            <entity name="keyword" dataSource="prodcat" query="select
                        dcs_category.display_name as keyword_name
                        from dcs_cat_chldprd, dcs_category, gl_category
                        where gl_category.availability = 0
                        and gl_category.exclude_in_vivisimo = 0
                        and 

Re: getting value from parent query in subquery transformer

2009-12-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
You do not need to pass the value as shown here. Make use of the
Context parameter (the second implicit parameter) to get hold of the value
of ${item.category}:

context.getVariableResolver().resolve("item.category")
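
For illustration, a minimal sketch of a custom Java transformer built around that call (the class name and the target column are hypothetical; a script transformer receives the same Context as its second argument):

  import java.util.Map;
  import org.apache.solr.handler.dataimport.Context;
  import org.apache.solr.handler.dataimport.Transformer;

  public class SplitAndPrettyCategory extends Transformer {
    public Object transformRow(Map<String, Object> row, Context context) {
      // read the parent entity's value without joining it into the child query
      Object category = context.getVariableResolver().resolve("item.category");
      if (category != null) {
        row.put("category", category.toString()); // hypothetical target column
      }
      return row;
    }
  }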

On Wed, Dec 2, 2009 at 7:20 PM, Joel Nylund jnyl...@yahoo.com wrote:
 Hi, I have an entity that has a entity within it that executes a query for
 each row and calls a transformer. Is there a way to pass a value from the
 parent query into the transformer?

 For example, I have an entity called document, and it it has an ID and
 sometimes it has a category.

 I have a sub entity called category that does another complex query using
 the documents ID to get data to send to the transformer to determine the
 category. I would like to pass the parents category to this transformer, so
 I dont have to join in data I already have. Is this possible?

 Im using ${item.id} in the where clause, so I guess im wondering, can I do
 something like.

 entity name=item query=..
        entity name=category
 transformer=script:SplitAndPrettyCategory(${item.category}) query=..

 thanks
 Joel




-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: no error delta fail with DataImportHandler

2009-12-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
You can probably try out this:
http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta

It may give you more info on what is happening.
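
Roughly, the approach that FAQ entry describes is to run a full-import that behaves like a delta: keep the delta logic in the main query (for example a WHERE clause on '${dataimporter.last_index_time}') and trigger it with

  http://host:port/solr/dataimport?command=full-import&clean=false

so the existing documents are not wiped; host and port are placeholders here.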

On Thu, Dec 3, 2009 at 10:58 PM, Thomas Woodard gtfo...@hotmail.com wrote:

 Unfortunately that isn't it. I have tried id, product_id, and PRODUCT_ID, and 
 they all produce the same result. It finds the modified item, but then does 
 nothing.

 INFO: Running ModifiedRowKey() for Entity: product
 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 
 call
 INFO: Creating a connection for entity product with URL: 
 jdbc:oracle:oci:@dev.eline.com
 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 
 call
 INFO: Time taken for getConnection(): 283
 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder 
 collectDelta
 INFO: Completed ModifiedRowKey for Entity: product rows obtained : 1
 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder 
 collectDelta
 INFO: Completed DeletedRowKey for Entity: product rows obtained : 0
 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder 
 collectDelta
 INFO: Completed parentDeltaQuery for Entity: product
 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder doDelta
 INFO: Delta Import completed successfully
 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder execute
 INFO: Time taken = 0:0:0.404


 From: noble.p...@corp.aol.com
 Date: Thu, 3 Dec 2009 12:50:15 +0530
 Subject: Re: no error delta fail with DataImportHandler
 To: solr-user@lucene.apache.org

 the deltaQuery select 'product_id' and your deltaImportQuery uses
 ${dataimporter.delta.id}
 I guess it should have been ${dataimporter.delta. product_id}

 On Wed, Dec 2, 2009 at 11:52 PM, Thomas Woodard gtfo...@hotmail.com wrote:
 
  I'm trying to get delta indexing set up. My configuration allows a full 
  index no problem, but when I create a test delta of a single record, the 
  delta import finds the record but then does nothing. I can only assume I 
  have something subtly wrong with my configuration, but according to the 
  wiki, my configuration should be valid. What I am trying to do is have a 
  single delta detected on the top level entity trigger a rebuild of 
  everything under that entity, the same as the first example in the wiki. 
  Any help would be greatly appreciated.
 

Re: Exception encountered during replication on slave....Any clues?

2009-12-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
Are you able to hit
http://localhost:8080/postingsmaster/replication using a browser from
the slave box? If you are able to hit it, what do you see?


On Tue, Dec 8, 2009 at 3:42 AM, William Pierce evalsi...@hotmail.com wrote:
 Just to make doubly sure,  per tck's suggestion,  I went in and explicitly
 added in the port in the masterurl so that it now reads:

 http://localhost:8080/postingsmaster/replication

 Still getting the same exception...

 I am running solr 1.4, on Ubuntu karmic, using tomcat 6 and Java 1.6.

 Thanks,

 - Bill

 --
 From: William Pierce evalsi...@hotmail.com
 Sent: Monday, December 07, 2009 2:03 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Exception encountered during replication on slaveAny clues?

 tck,

 thanks for your quick response.  I am running on the default port (8080).
 If I copy that exact string given in the masterUrl and execute it in the
 browser I get a response from solr:

 <?xml version="1.0" encoding="UTF-8" ?>
 <response>
   <lst name="responseHeader">
     <int name="status">0</int>
     <int name="QTime">0</int>
   </lst>
   <str name="status">OK</str>
   <str name="message">No command</str>
 </response>

 So the masterUrl is reachable/accessible so far as I am able to tell

 Thanks,

 - Bill

 --
 From: TCK moonwatcher32...@gmail.com
 Sent: Monday, December 07, 2009 1:50 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Exception encountered during replication on slaveAny
 clues?

 are you missing the port number in the master's url ?

 -tck



 On Mon, Dec 7, 2009 at 4:44 PM, William Pierce
 evalsi...@hotmail.comwrote:

 Folks:

 I am seeing this exception in my logs that is causing my replication to
 fail.    I start with  a clean slate (empty data directory).  I index
 the
 data on the postingsmaster using the dataimport handler and it succeeds.
  When the replication slave attempts to replicate it encounters this
 error.

 Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller
 fetchLatestIndex
 SEVERE: Master at: http://localhost/postingsmaster/replication is not
 available. Index fetch failed. Exception: Invalid version or the data in
 not
 in 'javabin' format

 Any clues as to what I should look for to debug this further?

 Replication is enabled as follows:

 The postingsmaster solrconfig.xml looks as follows:

 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <!-- Replicate on 'optimize'; it can also be 'commit' -->
     <str name="replicateAfter">commit</str>
     <!-- If configuration files need to be replicated give the names here, comma separated -->
     <str name="confFiles"></str>
   </lst>
 </requestHandler>

 The postings slave solrconfig.xml looks as follows:

 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="slave">
     <!-- fully qualified url for the replication handler of master -->
     <str name="masterUrl">http://localhost/postingsmaster/replication</str>
     <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
          If this is absent the slave does not poll automatically,
          but a snappull can be triggered from the admin or the http API -->
     <str name="pollInterval">00:05:00</str>
   </lst>
 </requestHandler>


 Thanks,

 - Bill









-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: how to set CORE when using Apache Solr extension?

2009-12-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
The core name is part of the URI:

http://host:port/solr-app/core-name/select

Say the core name is core1 and the Solr app named solr is deployed at port 8983;
then it would look like
http://host:8983/solr/core1/select

On Tue, Dec 8, 2009 at 3:44 AM, regany re...@newzealand.co.nz wrote:

 Hello,

 Can anyone tell me how you set which Solr CORE to use when using the Apache
 Solr extension? (Using Solr with multicores)
 http://www.php.net/manual/en/book.solr.php

 thanks,
 regan
 --
 View this message in context: 
 http://old.nabble.com/how-to-set-CORE-when-using-Apache-Solr-extension--tp26685174p26685174.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Oddly slow replication

2009-12-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
This has to be a network problem. We have never encountered such
vastly different speeds in the same LAN.

On Tue, Dec 8, 2009 at 3:22 AM, Simon Wistow si...@thegestalt.org wrote:
 I have a Master server with two Slaves populated via Solr 1.4 native
 replication.

 Slave1 syncs at a respectable speed i.e around 100MB/s but Slave2 runs
 much, much slower - the peak I've seen is 56KB/s.

 Both are running off the same hardware with the same config -
 compression is set to 'internal' and http(Conn|Read)Timeout are defaults
 (5000/1).

 I've checked too see if it was a disk problem using dd and if it was a
 network problem by doing a manual scp and an rsync from the slave to the
 master and the master to the slave.

 I've shut down the replication polling on Slave1 just to see if that was
 causing the problem but there's been no improvement.

 Any ideas?






-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: How to setup dynamic multicore replication

2009-12-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Tue, Dec 8, 2009 at 2:43 PM, Thijs vonk.th...@gmail.com wrote:
 Hi

 I need some help setting up dynamic multicore replication.

 We are changing our setup from a replicated single core index with multiple
 document types, as described on the wiki[1], to a dynamic multicore setup.
 We need this so that we can display facets with a zero count that are unique
 to the document 'type'.

 So when indexing new documents we want to create new cores on the fly using
 the CoreAdminHandler through SolrJ.

 What I can't figure out is how I setup solr.xml and solrconfig.xml so that a
 core automatically is also replicated from the master to it's slaves once
 it's created.

 I have a solr.xml that starts like this:

 <?xml version='1.0' encoding='UTF-8'?>
 <solr persistent="true">
  <cores adminPath="/admin/cores">
  </cores>
 </solr>

 and the replication part of solrconfig.xml
 master:
 <requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">optimize</str>
    <str name="confFiles">schema.xml</str>
  </lst>
 </requestHandler>

 slave:
 <requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://localhost:8081/solr/replication</str>
    <str name="pollInterval">00:00:20</str>
  </lst>
 </requestHandler>

 I think I should change the masterUrl in the slave configuration to
 something like:
 <str name="masterUrl">http://localhost:8081/solr/${solr.core.name}/replication</str>
 So that the replication automatically finds the correct core replication
 handler.
if you have dynamically created cores this is the solution.

 But how do I tell the slaves a new core is created, and that is should start
 replicating those to?

 Thanks in advance.

 Thijs

 [1]
 http://wiki.apache.org/solr/MultipleIndexes#Flattening_Data_Into_a_Single_Index





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Tika and DIH integration (https://issues.apache.org/jira/browse/SOLR-1358)

2009-12-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
We are very close to resolving SOLR-1358, so you may be able to use it.

On Tue, Dec 8, 2009 at 5:32 PM, Jorg Heymans jorg.heym...@gmail.com wrote:
 Hi,

 I am looking into using Solr for indexing a large database that has
 documents (mostly pdf and msoffice) stored as CLOBs in several tables.
 It is my understanding that the DIH as provided in Solr 1.4 cannot
 index these CLOBs yet, and that SOLR-1358 should provide exactly this.
 So i was wondering what the most 'recommended' way is of solving this
 .. Should it be done with a custom textextractor of some sort, set on
 the column/field ?

 Thanks,
 Jorg




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Replicating multiple cores

2009-12-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Dec 9, 2009 at 6:14 AM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
 Yes. I'd highly recommend using the Java replication though.

 Is there a reason?  I understand it's new etc, however I think one
 issue with it is it's somewhat non-native access to the filesystem.
 Can you illustrate a real world advantage other than the enhanced
 admin screens?
Complexity is the main problem with rsync-based replication. You have to
manage so many processes and monitor them separately. The other
problem is managing snapshots: these need to be cleaned up every now
and then, and you do not have enough info on what is happening/has happened.

 On Mon, Dec 7, 2009 at 11:13 PM, Shalin Shekhar Mangar
 shalinman...@gmail.com wrote:
 On Tue, Dec 8, 2009 at 11:48 AM, Jason Rutherglen 
 jason.rutherg...@gmail.com wrote:

 If I've got multiple cores on a server, I guess I need multiple
 rsyncd's running (if using the shell scripts)?


 Yes. I'd highly recommend using the Java replication though.

 --
 Regards,
 Shalin Shekhar Mangar.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: indexing XML with solr example webapp - out of java heap space

2009-12-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
The post.jar does not stream the file; use curl if you are on *nix.
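For example, something along these lines (host, port and file name are placeholders):

  curl 'http://localhost:8983/solr/update' -H 'Content-type:text/xml; charset=utf-8' --data-binary @your-file.xml

This keeps the 500M document out of a Java heap, unlike SimplePostTool, which reads the whole file into memory.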
--Noble

On Wed, Dec 9, 2009 at 12:28 AM, Feroze Daud fero...@zillow.com wrote:
 Hi!



 I downloaded SOLR and am trying to index an XML file. This XML file is
 huge (500M).



 When I try to index it using the post.jar tool in example\exampledocs,
 I get a out of java heap space error in the SimplePostTool
 application.



 Any ideas how to fix this? Passing in -Xms1024M does not fix it.



 Feroze.









-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DIH solrconfig

2009-12-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Dec 9, 2009 at 3:34 PM, Lee Smith l...@weblee.co.uk wrote:
 Hi All

 There seems to be massive difference between the solrconfig in the DIH
 example to the one in the normal example ?

 Would I be correct in saying if I was to add the dataimport request handler
 in the solrconfig.xml thats all I will need ?

 ie:

   <requestHandler name="/dataimport"
     class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>

 Is this correct ?
Yep, this is all you need.


 Lee




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Exception encountered during replication on slave....Any clues?

2009-12-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
Try the URL
http://localhost:8080/postingsmaster/replication?command=indexversion
in your browser.

On Tue, Dec 8, 2009 at 9:56 PM, William Pierce evalsi...@hotmail.com wrote:
 Hi, Noble:

 When I hit the masterUrl from the slave box at

 http://localhost:8080/postingsmaster/replication

 I get the following xml response:

  <?xml version="1.0" encoding="UTF-8" ?>
  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0</int>
    </lst>
    <str name="status">OK</str>
    <str name="message">No command</str>
  </response>

 And then when I look in the logs,  I see the exception that I mentioned.
 What exactly does this error mean that replication is not available.    By
 the way, when I go to the admin url for the slave and click on replication,
 I see a screen with the master url listed (as above) and the word
 unreachable after it.    And, of course, the same exception shows up in
 the tomcat logs.

 Thanks,

 - Bill

 --
 From: Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com
 Sent: Monday, December 07, 2009 9:20 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Exception encountered during replication on slaveAny clues?

 are you able to hit the
 http://localhost:8080/postingsmaster/replication using a browser from
 the slave box. if you are able to hit it what do you see?


 On Tue, Dec 8, 2009 at 3:42 AM, William Pierce evalsi...@hotmail.com
 wrote:

 Just to make doubly sure,  per tck's suggestion,  I went in and
 explicitly
 added in the port in the masterurl so that it now reads:

 http://localhost:8080/postingsmaster/replication

 Still getting the same exception...

 I am running solr 1.4, on Ubuntu karmic, using tomcat 6 and Java 1.6.

 Thanks,

 - Bill

 --
 From: William Pierce evalsi...@hotmail.com
 Sent: Monday, December 07, 2009 2:03 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Exception encountered during replication on slaveAny
 clues?

 tck,

 thanks for your quick response.  I am running on the default port
 (8080).
 If I copy that exact string given in the masterUrl and execute it in the
 browser I get a response from solr:

 ?xml version=1.0 encoding=UTF-8 ?
 - response
 - lst name=responseHeader
  int name=status0/int
  int name=QTime0/int
  /lst
  str name=statusOK/str
  str name=messageNo command/str
  /response

 So the masterUrl is reachable/accessible so far as I am able to tell

 Thanks,

 - Bill

 --
 From: TCK moonwatcher32...@gmail.com
 Sent: Monday, December 07, 2009 1:50 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Exception encountered during replication on slaveAny
 clues?

 are you missing the port number in the master's url ?

 -tck



 On Mon, Dec 7, 2009 at 4:44 PM, William Pierce
 evalsi...@hotmail.comwrote:

 Folks:

 I am seeing this exception in my logs that is causing my replication
 to
 fail.    I start with  a clean slate (empty data directory).  I index
 the
 data on the postingsmaster using the dataimport handler and it
 succeeds.
  When the replication slave attempts to replicate it encounters this
 error.

 Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller
 fetchLatestIndex
 SEVERE: Master at: http://localhost/postingsmaster/replication is not
 available. Index fetch failed. Exception: Invalid version or the data
 in
 not
 in 'javabin' format

 Any clues as to what I should look for to debug this further?

 Replication is enabled as follows:

 The postingsmaster solrconfig.xml looks as follows:

 requestHandler name=/replication class=solr.ReplicationHandler 
  lst name=master
    !--Replicate on 'optimize' it can also be  'commit' --
    str name=replicateAftercommit/str
    !--If configuration files need to be replicated give the names
 here
 .
 comma separated --
    str name=confFiles/str
  /lst
  /requestHandler

 The postings slave solrconfig.xml looks as follows:

 requestHandler name=/replication class=solr.ReplicationHandler 
  lst name=slave
      !--fully qualified url for the replication handler of
 master --
      str
 name=masterUrlhttp://localhost/postingsmaster/replication
 /str
      !--Interval in which the slave should poll master .Format is
 HH:mm:ss . If this is absent slave does not poll automatically.
       But a snappull can be triggered from the admin or the http API
 --
      str name=pollInterval00:05:00/str
   /lst
  /requestHandler


 Thanks,

 - Bill









 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Custom Field sample?

2009-12-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
how exactly do you wish to query these documents?

On Fri, Dec 11, 2009 at 4:35 PM, Antonio Zippo reven...@yahoo.it wrote:
 I need to add theese features to each document

 Document1
 ---
 Argument1, positive
 Argument2, positive
 Argument3, neutral
 Argument4, positive
 Argument5, negative
 Argument6, negative

 Document2
 ---
 Argument1, negative
 Argument2, positive
 Argument3, negative
 Argument6, negative
 Argument7, neutral

 where the argument name is dynamic
 using a relational database I could use a master detail structure, but in 
 solr?
 I thought about a Map or Pair field







 
 Da: Grant Ingersoll gsing...@apache.org
 A: solr-user@lucene.apache.org
 Inviato: Gio 10 dicembre 2009, 19:47:55
 Oggetto: Re: Custom Field sample?

 Can you perhaps give a little more info on what problem you are trying to 
 solve?  FWIW, there are a lot of examples of custom FieldTypes in the Solr 
 code.


 On Dec 10, 2009, at 11:46 AM, Antonio Zippo wrote:

 Hi all,

 could you help me to create a custom field?

 I need to create a field structured like a Map
 is it possible? how to define if the search string is on key or value (or 
 both)?

 A way could be to create a char separated multivalued string field... but it 
 isn't the best way. and with facets is the worst way

 could you give me a custom field sample?


 Thanks in advance,
  Revenge



 --
 Grant Ingersoll
 http://www.lucidimagination.com/

 Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using 
 Solr/Lucene:
 http://www.lucidimagination.com/search






-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: question regarding dynamic fields

2009-12-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
Use a copyField to copy those fields to another field and search on that field.
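A sketch in schema.xml terms, with placeholder pattern and destination names:

  <dynamicField name="attr_*" type="text" indexed="true" stored="true"/>
  <copyField source="attr_*" dest="text"/>

Here "text" would be the defaultSearchField, so queries without a field name hit the copied values.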

On Mon, Dec 14, 2009 at 1:00 PM, Phanindra Reva
reva.phanin...@gmail.com wrote:
 Hello..,
             I have observed that the text or keywords which are being
 indexed using dynamicField concept are being searchable only when we
 mention field name too while querying.Am I wrong with my observation
 or  is it the default and can not be changed? I am just wondering if
 there is any route to search the text indexed using dynamicFields with
 out having to mention the field name in the query.
 Thanks.




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Request Assistance with DIH

2009-12-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Sat, Dec 12, 2009 at 6:15 AM, Robbin rob...@drivesajeep.com wrote:
 I've been trying to use the DIH with oracle and would love it if someone
 could give me some pointers.  I put the ojdbc14.jar in both the Tomcat lib
 and solr home/lib.  I created a dataimport.xml and enabled it in the
 solrconfig.xml.  I go to the http://solr server/solr/admin/dataimport.jsp.
  This all seems to be fine, but I get the default page response and doesn't
 look like the connection to the oracle server is even attempted.
Did you trigger an import? What is the message on the web page, and what
do the logs say?
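For reference, an import is triggered with a request such as http://solrhost:port/solr/dataimport?command=full-import and its progress checked with command=status (host and port are placeholders); the connection to Oracle is normally only attempted once an import command is issued.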



 I'm using the Solr 1.4 release on Nov 10.
 Do I need an oracle client on the server?  I thought having the ojdbc jar
 should be sufficient.  Any help or configuration examples for setting this
 up would be much appreciated.
You need all the jars you would normally use to connect to Oracle.


 Thanks
 Robbin




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: apache-solr-common.jar

2009-12-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is no solr-common jar anymore. You can use the solrj jar, which
contains all the classes that were in the common jar.
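
If you use Maven, the dependency is roughly:

  <dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-solrj</artifactId>
    <version>1.4.0</version>
  </dependency>

Otherwise, pick the apache-solr-solrj-1.4.0.jar from the dist/ directory of the 1.4 download.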

On Mon, Dec 14, 2009 at 9:22 PM, gudumba l gudumba.sm...@gmail.com wrote:
 Hello All,
               I have been using apache-solr-common-1.3.0.jar in my module.
 I am planning to shift to the latest version, because of course it has more
 flexibility. But it is really strange that I dont find any corresponding jar
 of the latest version. I have serached in total apachae solr 1.4 folder
 (which is downloaded from site), but have not found any. , I am sorry, its
 really silly to request for a jar, but have no option.
 Thanks.




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: solr core size on disk

2009-12-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
Look at the index dir and see the size of the files. It is typically
in $SOLR_HOME/data/index.

On Thu, Dec 17, 2009 at 2:56 AM, Matthieu Labour matth...@kikin.com wrote:
 Hi
 I am new to solr. Here is my question:
 How to find out the size of a solr core on disk ?
 Thank you
 matt




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: shards parameter

2009-12-17 Thread Noble Paul നോബിള്‍ नोब्ळ्
Yes.
Put it under the defaults section in your standard requestHandler.
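For example (host names are placeholders):

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="shards">solr1:8983/solr,solr2:8983/solr</str>
    </lst>
  </requestHandler>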

On Thu, Dec 17, 2009 at 5:22 PM, pcurila p...@eea.sk wrote:

 Hello, is there any way to configure shards parameter in solrconfig.xml? So I
 do not need provide it in the url. Thanks Peter
 --
 View this message in context: 
 http://old.nabble.com/shards-parameter-tp26826908p26826908.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Is DataImportHandler ThreadSafe???

2009-12-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Sat, Dec 19, 2009 at 2:16 PM, gurudev suyalprav...@yahoo.com wrote:

 Hi,
 Just wanted to know, Is the DataImportHandler available in solr1.3
 thread-safe?. I would like to use multiple instances of data import handler
 running concurrently and posting my various set of data from DB to Index.

 Can I do this by registering the DIH multiple times with various names in
 solrconfig.xml and then invoking all concurrently to achieve maximum
 throughput? Would i need to define different data-config.xml's 
 dataimport.properties for each DIH?
Yes, this should work. It is thread-safe.
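A sketch of registering the handler twice in solrconfig.xml; the handler and file names are just examples:

  <requestHandler name="/dataimport-a" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults"><str name="config">data-config-a.xml</str></lst>
  </requestHandler>
  <requestHandler name="/dataimport-b" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults"><str name="config">data-config-b.xml</str></lst>
  </requestHandler>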

 If it would be possible to specify the query in data-config.xml to restrict
 one DIH from overlapping the data-set fetched by another DIH through some
 SQL clauses?

 --
 View this message in context: 
 http://old.nabble.com/Is-DataImportHandler-ThreadSafetp26853521p26853521.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Documents are indexed but not searchable

2009-12-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
just search for *:* and see if the docs are indeed there in the index.
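For example: http://localhost:8080/solr/select?q=*:*&rows=10 (same host and port as the query below).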
--Noble
On Mon, Dec 21, 2009 at 9:26 AM, krosan kro...@gmail.com wrote:

 Hi,

 I'm trying to test solr for a proof of concept project, but I'm having some
 problems.

 I indexed my document, but when I search for a word which is 100% certain in
 the document, I don't get any hits.

 These are my files:

 First: my data-config.xml

 <dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://host.com:3306/crossfire3"
              user="user"
              password="pass"
              batchSize="1"/>
  <document>
    <entity name="users"
            query="select username, password, email from users">
                <field column="username" name="username" />
                <field column="password" name="password" />
                <field column="email" name="email" />
    </entity>
  </document>
 </dataConfig>

 Now, I have used this in the debugger, and with commit on, and verbose on, I
 get this reply:

 http://pastebin.com/m7a460711

 This clearly states that those 2 rows have been processed and are now in the
 index.
 However, when I try to do a search with the http parameters, I get this
 response:

 For the hyperlink
 http://localhost:8080/solr/select?q=username:krosandebugQuery=on
 this is the response:
 http://pastebin.com/m7bb1dcaa

 I'm clueless on what the problem could be!

 These are my two config files:

 schema.xml: http://pastebin.com/m1fd1da58
 solrconfig.xml: http://pastebin.com/m44b73d83
 (look for krosan in the documents to see what I've added to the standard
 docs)

 Any help will be greatly appreciated!

 Thanks in advance,

 Andreas Evers
 --
 View this message in context: 
 http://old.nabble.com/Documents-are-indexed-but-not-searchable-tp26868925p26868925.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: suggestions for DIH batchSize

2009-12-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
A bigger batchSize results in increased memory usage. I guess
performance should be slightly better with bigger values, but I have
not verified that.
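
For example, only batchSize differs from the defaults here (driver and url are placeholders):

  <dataSource type="JdbcDataSource" driver="..." url="..." batchSize="1000"/>

If memory is the concern with MySQL, the DIH FAQ suggests batchSize="-1" so that rows are streamed.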

On Wed, Dec 23, 2009 at 2:51 AM, Joel Nylund jnyl...@yahoo.com wrote:
 Hi,

 it looks like from looking at the code the default is 500, is the
 recommended setting for this?

 Has anyone notice any significant performance/memory tradeoffs by making
 this much bigger?

 thanks
 Joel





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Problem with simple use of DIH

2009-12-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
did you run it w/o the debug?

On Sun, Dec 27, 2009 at 6:31 PM, AHMET ARSLAN iori...@yahoo.com wrote:
 I'm trying to use DataImportHandler
 to load my index and having some strange
 results. I have two tables in my database. DPRODUC contains
 products and
 FSKUMAS contains the skus related to each product.

 This is the data-config I'm using.

  <dataConfig>
    <dataSource type="JdbcDataSource"
      driver="com.ibm.as400.access.AS400JDBCDriver"
      url="jdbc:as400:IWAVE;prompt=false;naming=system"
      user="IPGUI"
      password="IPGUI"/>
    <document>
      <entity name="dproduc"
        query="select dprprd, dprdes from dproduc where dprprd like 'F%'">
        <field column="dprprd" name="id" />
        <field column="dprdes" name="name" />
        <entity name="fskumas"
          query="select fsksku, fcoclr, fszsiz, fskret
            from fskumas where dprprd='${dproduc.DPRPRD}'">
           <field column="fsksku" name="sku" />
           <field column="fcoclr" name="color" />
           <field column="fszsiz" name="size" />
           <field column="fskret" name="price" />
        </entity>
      </entity>
    </document>
  </dataConfig>

 What is the primary key of the dproduc table? If it is dprprd, can you try adding pk="dprprd" to <entity name="dproduc">?

 <entity name="dproduc" pk="dprprd" query="select dprprd, dprdes from dproduc where dprprd like 'F%'">







-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Problem with simple use of DIH

2009-12-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
The field names are case sensitive. But if the <field> tags are
missing, they are mapped to corresponding Solr fields in a
case-insensitive way. Apparently all the fields come out of your
database in ALL CAPS, so you should put the 'column' values in ALL CAPS too.
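That is, keeping the original field mappings, they would need to look like:

  <field column="DPRPRD" name="id" />
  <field column="DPRDES" name="name" />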

On Sun, Dec 27, 2009 at 9:03 PM, Jay Fisher jay.l.fis...@gmail.com wrote:
 I did run it without debug and the result was that 0 documents were
 processed.

 The problem seems to be with the field tags that I was using to map from
 the table column names to the schema.xml field names. I switched to using an
 AS clause in the SQL statement instead and it worked.

 I think the column names may be case-sensitive, although I haven't proven
 that to be the case. I did discover that references to column names in the
 velocity template are case sensitive; ${dproduc.DPRPRD} works
 and ${dproduc.dprprd} does not.

 Thanks, Jay

 2009/12/27 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com

 did you run it w/o the debug?

 On Sun, Dec 27, 2009 at 6:31 PM, AHMET ARSLAN iori...@yahoo.com wrote:
  I'm trying to use DataImportHandler
  to load my index and having some strange
  results. I have two tables in my database. DPRODUC contains
  products and
  FSKUMAS contains the skus related to each product.
 
  This is the data-config I'm using.
 
  dataConfig
    dataSource type=JdbcDataSource
 
  driver=com.ibm.as400.access.AS400JDBCDriver
 
  url=jdbc:as400:IWAVE;prompt=false;naming=system
 
  user=IPGUI
 
  password=IPGUI/
    document
      entity name=dproduc
        query=select dprprd, dprdes from
  dproduc where dprprd like 'F%'
        field column=dprprd name=id
  /
        field column=dprdes name=name
  /
        entity name=fskumas
          query=select fsksku, fcoclr,
  fszsiz, fskret
            from fskumas where
  dprprd='${dproduc.DPRPRD}'
           field
  column=fsksku name=sku /
           field
  column=fcoclr name=color /
           field
  column=fszsiz name=size /
           field
  column=fskret name=price /
        /entity
      /entity
    /document
  /dataConfig
 
  What is the primary key of dproduc table? If it is dprprd can you try
 adding pk=dprprd to entity name=dproduc?
 
  entity name=dproduc pk=dprprd  query=select dprprd, dprdes from
 dproduc where dprprd like 'F%'
 
 
 
 



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: fl parameter and dynamic fields

2009-12-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you wish to search on fields using a wild-card, you have to use a
copyField to copy all the values of Bool_* to another field and
search on that field.


On Tue, Dec 29, 2009 at 4:14 AM, Harsch, Timothy J. (ARC-TI)[PEROT
SYSTEMS] timothy.j.har...@nasa.gov wrote:
 I use dynamic fields heavily in my SOLR config.  I would like to be able to 
 specify which fields should be returned from a query based on a pattern for 
 the field name.  For instance, given:

             <dynamicField name="Bool_*" type="boolean"
                   indexed="true" stored="true" />

 I might be able to construct a query like:
 http://localhost:8080/solr/select?q=Bool_*:truerows=10

 Is there something like this in SOLR?

 Thanks,
 Tim Harsch





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: serialize SolrInputDocument to java.io.File and back again?

2009-12-31 Thread Noble Paul നോബിള്‍ नोब्ळ्
What serialization would you wish to use?

You can use Java serialization, or SolrJ can help you serialize it as XML
or in the javabin format
(org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec).
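
A minimal sketch of the XML route, assuming SolrJ 1.4 is on the classpath; the class and file naming here are hypothetical. ClientUtils.toXML emits the <doc> element used by the XML update handler, so the stored file can later be POSTed to /update (wrapped in <add>) or read back however you prefer:

  import java.io.File;
  import java.io.FileWriter;
  import org.apache.solr.client.solrj.util.ClientUtils;
  import org.apache.solr.common.SolrInputDocument;

  public class DocSpool {
    public static File spool(SolrInputDocument doc, File dir) throws Exception {
      File out = new File(dir, System.currentTimeMillis() + ".xml"); // hypothetical naming scheme
      FileWriter w = new FileWriter(out);
      try {
        w.write("<add>");
        w.write(ClientUtils.toXML(doc));
        w.write("</add>");
      } finally {
        w.close();
      }
      return out;
    }
  }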

On Thu, Dec 31, 2009 at 6:55 AM, Phillip Rhodes rhodebumpl...@gmail.com wrote:
 I want to store a SolrInputDocument to the filesystem until it can be sent
 to the solr server via the solrj client.

 I will be using a quartz job to periodically query a table that contains a
 listing of SolrInputDocuments stored as java.io.File that need to be
 processed.

 Thanks for your time.




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: replicating extension JARs

2010-01-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
Jars are not replicated; it is by design. But that is not to say that
we can't do it. Open an issue.

On Wed, Jan 6, 2010 at 6:20 AM, Ryan Kennedy rcken...@gmail.com wrote:
 Will the built-in Solr replication replicate extension JAR files in
 the lib directory? The documentation appears to indicate that only
 the index and any specified configuration files will be replicated,
 however if your solrconfig.xml references a class in a JAR file added
 to the lib directory then you'll need that replicated as well
 (otherwise the slave will encounter ClassDefNotFound exceptions). I'm
 wondering if I'm missing something and Solr replication will do that
 or if it's a deficiency in Solr's replication.

 Ryan




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Solr Replication Questions

2010-01-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Jan 6, 2010 at 2:51 AM, Giovanni Fernandez-Kincade
gfernandez-kinc...@capitaliq.com wrote:
 http://wiki.apache.org/solr/SolrReplication

 I've been looking over this replication wiki and I'm still unclear on a two 
 points about Solr Replication:

 1.     If there have been small changes to the index on the master, does the 
 slave copy the entire contents of the index files that were affected?

only the delta is copied.

 a.     Let's say I add one document to the master. Presumably that causes 
 changes to the position file, amidst a few others. Does the slave download 
 the entire position file? Or just the portion that was changed?

Lucene never modifies a file which was written by previous commits. So
if you add a new document and commit, it is written to new files.
Solr replication will only replicate those new files.
 2.     If you have a multi-core slave, is it possible to share one 
 configuration file (i.e. one instance directory) amidst the multiple cores, 
 and yet each core poll a different master?

 a.     Can you set the masterUrl for each core separately in the server.xml?


 Thanks for your help,
 Gio.




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: readOnly=true IndexReader

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Jan 6, 2010 at 4:26 PM, Patrick Sauts patrick.via...@gmail.com wrote:
 In the Wiki page : http://wiki.apache.org/lucene-java/ImproveSearchingSpeed,
 I've found
 -Open the IndexReader with readOnly=true. This makes a big difference when
 multiple threads are sharing the same reader, as it removes certain sources
 of thread contention.

 How to open the IndexReader with readOnly=true ?
 I can't find anything related to this parameter.

 Do the VJM parameters -Dslave=disabled or -Dmaster=disabled have any
 incidence on solr with a standart solrConfig.xml?
These are not variables used by Solr. They are just substituted in
solrconfig.xml and probably consumed by the ReplicationHandler (this is
not a standard).

 Thank you for your answers.

 Patrick.




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: replication -- missing field data file

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Jan 6, 2010 at 9:49 PM, Giovanni Fernandez-Kincade
gfernandez-kinc...@capitaliq.com wrote:
 I set up replication between 2 cores on one master and 2 cores on one slave. 
 Before doing this the master was working without issues, and I stopped all 
 indexing on the master.

 Now that replication has synced the index files, an .FDT field is suddenly 
 missing on both the master and the slave. Pretty much every operation (core 
 reload, commit, add document) fails with an error like the one posted below.

 How could this happen? How can one recover from such an error? Is there any 
 way to regenerate the FDT file without re-indexing everything?

 This brings me to a question about backups. If I run the 
 replication?command=backup command, where is this backup stored? I've tried 
 this a few times and get an OK response from the machine, but I don't see the 
 backup generated anywhere.
The backup is done asynchronously, so it always gives an OK response immediately.
The backup is created in the data dir itself.

 Thanks,
 Gio.

 org.apache.solr.common.SolrException: Error handling 'reload' action
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:412)
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:142)
       at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
       at 
 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:298)
       at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
       at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
       at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
       at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
       at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
       at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
       at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
       at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
       at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
       at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
       at 
 org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
       at 
 org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
       at 
 org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
       at 
 org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
       at java.lang.Thread.run(Unknown Source)
 Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
 Y:\solrData\FilingsCore2\index\_a0r.fdt (The system cannot find the file 
 specified)
       at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
       at org.apache.solr.core.SolrCore.lt;initgt;(SolrCore.java:579)
       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:425)
       at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:486)
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:409)
       ... 18 more
 Caused by: java.io.FileNotFoundException: 
 Y:\solrData\FilingsCore2\index\_a0r.fdt (The system cannot find the file 
 specified)
       at java.io.RandomAccessFile.open(Native Method)
       at java.io.RandomAccessFile.lt;initgt;(Unknown Source)
       at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.lt;initgt;(SimpleFSDirectory.java:78)
       at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.lt;initgt;(SimpleFSDirectory.java:108)
       at 
 org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:65)
       at 
 org.apache.lucene.index.FieldsReader.lt;initgt;(FieldsReader.java:104)
       at 
 org.apache.lucene.index.SegmentReader$CoreReaders.openDocStores(SegmentReader.java:277)
       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:640)
       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:599)
       at 
 org.apache.lucene.index.DirectoryReader.lt;initgt;(DirectoryReader.java:103)
       at 
 org.apache.lucene.index.ReadOnlyDirectoryReader.lt;initgt;(ReadOnlyDirectoryReader.java:27)
       at 
 org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:73)
       at 
 org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:704)
       at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:68)
       at org.apache.lucene.index.IndexReader.open(IndexReader.java:476)
       at 

Re: replication -- missing field data file

2010-01-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
The index dir is the one simply named "index"; the other directories
(backups) will be stored as index<date-as-number>.

On Wed, Jan 6, 2010 at 10:31 PM, Giovanni Fernandez-Kincade
gfernandez-kinc...@capitaliq.com wrote:
 How can you differentiate between the backup and the normal index files?

 -Original Message-
 From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble 
 Paul ??? ??
 Sent: Wednesday, January 06, 2010 11:52 AM
 To: solr-user
 Subject: Re: replication -- missing field data file

 On Wed, Jan 6, 2010 at 9:49 PM, Giovanni Fernandez-Kincade
 gfernandez-kinc...@capitaliq.com wrote:
 I set up replication between 2 cores on one master and 2 cores on one slave. 
 Before doing this the master was working without issues, and I stopped all 
 indexing on the master.

 Now that replication has synced the index files, an .FDT field is suddenly 
 missing on both the master and the slave. Pretty much every operation (core 
 reload, commit, add document) fails with an error like the one posted below.

 How could this happen? How can one recover from such an error? Is there any 
 way to regenerate the FDT file without re-indexing everything?

 This brings me to a question about backups. If I run the 
 replication?command=backup command, where is this backup stored? I've tried 
 this a few times and get an OK response from the machine, but I don't see 
 the backup generated anywhere.
 The backup is done asynchronously. So it always gives an OK response 
 immedietly.
 The backup is created in the data dir itself

 Thanks,
 Gio.

 org.apache.solr.common.SolrException: Error handling 'reload' action
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:412)
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:142)
       at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
       at 
 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:298)
       at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
       at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
       at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
       at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
       at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
       at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
       at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
       at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
       at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
       at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
       at 
 org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
       at 
 org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
       at 
 org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
       at 
 org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
       at java.lang.Thread.run(Unknown Source)
 Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
 Y:\solrData\FilingsCore2\index\_a0r.fdt (The system cannot find the file 
 specified)
       at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
       at org.apache.solr.core.SolrCore.lt;initgt;(SolrCore.java:579)
       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:425)
       at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:486)
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:409)
       ... 18 more
 Caused by: java.io.FileNotFoundException: 
 Y:\solrData\FilingsCore2\index\_a0r.fdt (The system cannot find the file 
 specified)
       at java.io.RandomAccessFile.open(Native Method)
       at java.io.RandomAccessFile.lt;initgt;(Unknown Source)
       at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.lt;initgt;(SimpleFSDirectory.java:78)
       at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.lt;initgt;(SimpleFSDirectory.java:108)
       at 
 org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:65)
       at 
 org.apache.lucene.index.FieldsReader.lt;initgt;(FieldsReader.java:104)
       at 
 org.apache.lucene.index.SegmentReader$CoreReaders.openDocStores(SegmentReader.java:277)
       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:640)
       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:599)
       at 
 

Re: replication -- missing field data file

2010-01-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
Actually, it does not.
BTW, FYI, backup is just for taking periodic backups; it is not necessary for
the ReplicationHandler to work.

On Thu, Jan 7, 2010 at 2:37 AM, Giovanni Fernandez-Kincade
gfernandez-kinc...@capitaliq.com wrote:
 How can you tell when the backup is done?

 -Original Message-
 From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble 
 Paul ??? ??
 Sent: Wednesday, January 06, 2010 12:23 PM
 To: solr-user
 Subject: Re: replication -- missing field data file

 the index dir is in the name index others will be stored as
 indexdate-as-number

 On Wed, Jan 6, 2010 at 10:31 PM, Giovanni Fernandez-Kincade
 gfernandez-kinc...@capitaliq.com wrote:
 How can you differentiate between the backup and the normal index files?

 -Original Message-
 From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble 
 Paul ??? ??
 Sent: Wednesday, January 06, 2010 11:52 AM
 To: solr-user
 Subject: Re: replication -- missing field data file

 On Wed, Jan 6, 2010 at 9:49 PM, Giovanni Fernandez-Kincade
 gfernandez-kinc...@capitaliq.com wrote:
 I set up replication between 2 cores on one master and 2 cores on one 
 slave. Before doing this the master was working without issues, and I 
 stopped all indexing on the master.

 Now that replication has synced the index files, an .FDT field is suddenly 
 missing on both the master and the slave. Pretty much every operation (core 
 reload, commit, add document) fails with an error like the one posted below.

 How could this happen? How can one recover from such an error? Is there any 
 way to regenerate the FDT file without re-indexing everything?

 This brings me to a question about backups. If I run the 
 replication?command=backup command, where is this backup stored? I've tried 
 this a few times and get an OK response from the machine, but I don't see 
 the backup generated anywhere.
 The backup is done asynchronously. So it always gives an OK response 
 immedietly.
 The backup is created in the data dir itself

 Thanks,
 Gio.

 org.apache.solr.common.SolrException: Error handling 'reload' action
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:412)
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:142)
       at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
       at 
 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:298)
       at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
       at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
       at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
       at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
       at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
       at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
       at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
       at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
       at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
       at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
       at 
 org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
       at 
 org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
       at 
 org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
       at 
 org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
       at java.lang.Thread.run(Unknown Source)
 Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
 Y:\solrData\FilingsCore2\index\_a0r.fdt (The system cannot find the file 
 specified)
       at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:579)
       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:425)
       at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:486)
       at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:409)
       ... 18 more
 Caused by: java.io.FileNotFoundException: 
 Y:\solrData\FilingsCore2\index\_a0r.fdt (The system cannot find the file 
 specified)
       at java.io.RandomAccessFile.open(Native Method)
       at java.io.RandomAccessFile.<init>(Unknown Source)
       at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:78)
       at 
 

Re: Synonyms from Database

2010-01-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Sun, Jan 10, 2010 at 1:04 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
 Ravi,

 I think if your synonyms were in a DB, it would be trivial to periodically 
 dump them into a text file Solr expects.  You wouldn't want to hit the DB to 
 look up synonyms at query time...
Why query time? Can it not be done at startup time?
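
A minimal sketch of the periodic dump Otis describes (the JDBC URL, table and
column names, and the output path are all assumptions for illustration): read
synonym groups from the DB and write them in the comma-separated form the
synonyms file uses.

  import java.io.FileWriter;
  import java.io.PrintWriter;
  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  public class DumpSynonyms {
    public static void main(String[] args) throws Exception {
      // assumed connection and schema: one row per synonym group,
      // terms already comma separated, e.g. "tv,television,telly"
      Connection con = DriverManager.getConnection(
          "jdbc:mysql://localhost/dict", "user", "pass");
      Statement st = con.createStatement();
      ResultSet rs = st.executeQuery("select terms from synonym_groups");
      PrintWriter out = new PrintWriter(new FileWriter("synonyms.txt"));
      while (rs.next()) {
        out.println(rs.getString("terms"));  // one synonym group per line
      }
      out.close();
      con.close();
    }
  }

After rewriting synonyms.txt, a core reload (or restart) lets the analyzers
pick up the new file, which matches the startup-time idea above.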


 Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



 - Original Message 
 From: Ravi Gidwani ravi.gidw...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Sat, January 9, 2010 10:20:18 PM
 Subject: Synonyms from Database

 Hi :
      Is there any work done in providing synonyms from a database instead of
 synonyms.txt file ? Idea is to have a dictionary in DB that can be enhanced
 on the fly in the application. This can then be used at query time to check
 for synonyms.

 I know I am not putting thoughts to the performance implications of this
 approach, but will love to hear about others thoughts.

 ~Ravi.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Data Full Import Error

2010-01-12 Thread Noble Paul നോബിള്‍ नोब्ळ्
You need more memory to run dataimport.


On Tue, Jan 12, 2010 at 4:46 PM, Lee Smith l...@weblee.co.uk wrote:
 Hi All

 I am trying to do a data import but I am getting the following error.

 INFO: [] webapp=/solr path=/dataimport params={command=status} status=0 
 QTime=405
 2010-01-12 03:08:08.576::WARN:  Error for /solr/dataimport
 java.lang.OutOfMemoryError: Java heap space
 Jan 12, 2010 3:08:05 AM org.apache.solr.handler.dataimport.DataImporter 
 doFullImport
 SEVERE: Full Import failed
 java.lang.OutOfMemoryError: Java heap space
 Exception in thread btpool0-2 java.lang.OutOfMemoryError: Java heap space
 Jan 12, 2010 3:08:14 AM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: start rollback
 Jan 12, 2010 3:08:21 AM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: end_rollback
 Jan 12, 2010 3:08:23 AM org.apache.solr.update.SolrIndexWriter finalize
 SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug 
 -- POSSIBLE RESOURCE LEAK!!!

This is OK. Don't bother about it.

 Any ideas what this can be ??

 Hope you can help.

 Lee





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Data Full Import Error

2010-01-12 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is set when you start your Solr server (the -Xmx option).
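
For example, with the bundled Jetty start script the heap ceiling can be
raised like this (1024m is only an illustrative value; raise it until the
import completes):

  java -Xmx1024m -jar start.jar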

On Tue, Jan 12, 2010 at 6:00 PM, Lee Smith l...@weblee.co.uk wrote:
 Thank you for your response.

 Will I just need to adjust the allowed memory in a config file or is this a 
 server issue. ?

 Sorry I know nothing about Java.

 Hope you can advise !

 On 12 Jan 2010, at 12:26, Noble Paul നോബിള്‍ नोब्ळ् wrote:

 You need more memory to run dataimport.


 On Tue, Jan 12, 2010 at 4:46 PM, Lee Smith l...@weblee.co.uk wrote:
 Hi All

 I am trying to do a data import but I am getting the following error.

 INFO: [] webapp=/solr path=/dataimport params={command=status} status=0 
 QTime=405
 2010-01-12 03:08:08.576::WARN:  Error for /solr/dataimport
 java.lang.OutOfMemoryError: Java heap space
 Jan 12, 2010 3:08:05 AM org.apache.solr.handler.dataimport.DataImporter 
 doFullImport
 SEVERE: Full Import failed
 java.lang.OutOfMemoryError: Java heap space
 Exception in thread btpool0-2 java.lang.OutOfMemoryError: Java heap space
 Jan 12, 2010 3:08:14 AM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: start rollback
 Jan 12, 2010 3:08:21 AM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: end_rollback
 Jan 12, 2010 3:08:23 AM org.apache.solr.update.SolrIndexWriter finalize
 SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug 
 -- POSSIBLE RESOURCE LEAK!!!

 This is OK. don't bother

 Any ideas what this can be ??

 Hope you can help.

 Lee





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler - synchronous execution

2010-01-12 Thread Noble Paul നോബിള്‍ नोब्ळ्
it can be added
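
Until an explicit option exists, here is a minimal sketch of the dummy-stream
workaround Alexey describes below, for running DIH in the calling thread from
an EmbeddedSolrServer (the core name, handler path and solr home are
assumptions):

  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
  import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
  import org.apache.solr.common.util.ContentStreamBase;
  import org.apache.solr.core.CoreContainer;

  public class SyncImport {
    public static void main(String[] args) throws Exception {
      System.setProperty("solr.solr.home", "/path/to/solr/home"); // assumption
      CoreContainer container = new CoreContainer.Initializer().initialize();
      SolrServer server = new EmbeddedSolrServer(container, "core0");

      ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/dataimport");
      req.setParam("command", "full-import");
      req.setParam("commit", "true");
      // the dummy stream is the workaround: DIH then runs in this thread
      req.addContentStream(new ContentStreamBase.StringStream("dummy"));
      server.request(req);

      container.shutdown();
    }
  }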

On Tue, Jan 12, 2010 at 10:18 PM, Alexey Serba ase...@gmail.com wrote:
 Hi,

 I found that there's no explicit option to run DataImportHandler in a
 synchronous mode. I need that option to run DIH from SolrJ (
 EmbeddedSolrServer ) in the same thread. Currently I pass dummy stream
 to DIH as a workaround for this, but I think it makes sense to add
 specific option for that. Any objections?

 Alex




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: NullPointerException in ReplicationHandler.postCommit + question about compression

2010-01-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
When you copy-paste config from the wiki, copy only what you need,
excluding the documentation and comments.
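
For reference, a master section with the commented wiki examples stripped out
might look like this (the confFiles list is an assumption; list only files
that really exist on the master):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
  </requestHandler>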

On Wed, Jan 13, 2010 at 12:51 AM, Stephen Weiss swe...@stylesight.com wrote:
 Hi Solr List,

 We're trying to set up java-based replication with Solr 1.4 (dist tarball).
  We are running this to start with on a pair of test servers just to see how
 things go.

 There's one major problem we can't seem to get past.  When we replicate
 manually (via the admin page) things seem to go well.  However, when
 replication is triggered by a commit event on the master, the master gets a
 NullPointerException and no replication seems to take place.

 SEVERE: java.lang.NullPointerException
        at
 org.apache.solr.handler.ReplicationHandler$4.postCommit(ReplicationHandler.java:922)
        at
 org.apache.solr.update.UpdateHandler.callPostCommitCallbacks(UpdateHandler.java:78)
        at
 org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:411)
        at
 org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
        at
 org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:169)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
        at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)
        at
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115)
        at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361)
        at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
        at
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:324)
        at
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
        at
 org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:741)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:213)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
        at
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
        at
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)



 This is the master config:

  requestHandler name=/replication class=solr.ReplicationHandler 
    lst name=master
        !--Replicate on 'optimize'. Other values can be 'commit', 'startup'.
        It is possible to have multiple entries of this config string--
        str name=replicateAftercommit/str

        !--Create a backup after 'optimize'. Other values can be 'commit',
 'startup'.
        It is possible to have multiple entries of this config string.  Note
 that this is
        just for backup, replication does not require this. --
        !-- str name=backupAfteroptimize/str --

        !--If configuration files need to be replicated give the names here,
 separated by comma --
        str
 name=confFilessolrconfig_slave.xml:solrconfig.xml,schema.xml,synonyms.txt,stopwords.txt,elevate.xml/str

        !--The default value of reservation is 10 secs.See the documentation
 below.
        Normally , you should not need to specify this --
        str name=commitReserveDuration00:00:10/str
    /lst
  /requestHandler


 and... the slave config:

  requestHandler name=/replication class=solr.ReplicationHandler 
    lst name=slave

        !--fully qualified url for the replication handler of master . It is
 possible
         to pass on this as a request param for the fetchindex command--
        str
 name=masterUrlhttp://hostname.obscured.com:8080/solr/calendar_core/replication/str

        !--Interval in which the slave should poll master .Format is
 HH:mm:ss .
         If this is absent slave does not poll automatically.
         But a fetchindex can be triggered from the admin or the http API --
        str name=pollInterval00:00:20/str

        !-- THE FOLLOWING PARAMETERS ARE USUALLY NOT REQUIRED--
        !--to use compression while transferring the index files. The
 possible values are internal|external
        

Re: Fastest way to use solrj

2010-01-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
2010/1/19 Tim Terlegård tim.terleg...@gmail.com:
 There are a few ways to use solrj. I just learned that I can use the
 javabin format to get some performance gain. But when I try the binary
 format nothing is added to the index. This is how I try to use this:

    server = new CommonsHttpSolrServer(http://localhost:8983/solr;)
    server.setRequestWriter(new BinaryRequestWriter())
    request = new UpdateRequest()
    request.setAction(UpdateRequest.ACTION.COMMIT, true, true);
    request.setParam(stream.file, /tmp/data.bin)
    request.process(server)

 Should this work? Could there be something wrong with the file? I
 haven't found a good reference for how to create a javabin file, but
 by reading the source code I came up with this (groovy code):
BinaryRequestWriter does not read from a file and post it

    fieldId = new NamedList()
    fieldId.add(name, id)
    fieldId.add(val, 9-0)
    fieldId.add(boost, null)
    fieldText = new NamedList()
    fieldText.add(name, text)
    fieldText.add(val, Some text)
    fieldText.add(boost, null)
    fieldNull = new NamedList()
    fieldNull.add(boost, null)
    doc = [fieldNull, fieldId, fieldText]
    docs = [doc]
    root = new NamedList()
    root.add(docs, docs)
    fos = new FileOutputStream(data.bin)
    new JavaBinCodec().marshal(root, fos)

 I haven't found any examples of using stream.file like this with a
 binary file. Is it supported? Is it better/faster to use
 StreamingUpdateSolrServer and send everything over HTTP instead? Would
 code for that look something like this?

    while (moreDocs) {
        xmlDoc = readDocFromFileUsingSaxParser()
        doc = new SolrInputDocument()
        doc.addField(id, 9-0)
        doc.addField(text, Some text)
        server.add(doc)
    }

 To me it instinctively looks as if stream.file would be faster because
 it doesn't have to use HTTP and it doesn't have to create a bunch of
 SolrInputDocument objects.

 /Tim




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DIH delta import - last modified date

2010-01-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
While invoking the delta-import you may pass the value as a request
parameter. That value can be used in the query as ${dih.request.xyz},
where xyz is the request parameter name.
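
A hypothetical example (the handler path, parameter name and table/column
names are all made up for illustration): pass the timestamp on the request,

  http://localhost:8983/solr/dataimport?command=delta-import&lastMod=2010-01-19T00:00:00

and reference it in the entity's deltaQuery:

  deltaQuery="select moment_id from moments
              where date_modified > '${dih.request.lastMod}'"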

On Wed, Jan 20, 2010 at 1:15 AM, Yao Ge yao...@gmail.com wrote:

 I am struggling with the concept of delta import in DIH. According the to
 documentation, the delta import will automatically record the last index
 time stamp and make it available to use for the delta query. However in many
 case when the last_modified date time stamp in the database lag behind the
 current time, the last index time stamp is the not good for delta query. Can
 I pick a different mechanism to generate last_index_time by using time
 stamp computed from the database (such as from a column of the database)?
 --
 View this message in context: 
 http://old.nabble.com/DIH-delta-import---last-modified-date-tp27231449p27231449.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Fastest way to use solrj

2010-01-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
2010/1/20 Tim Terlegård tim.terleg...@gmail.com:
 BinaryRequestWriter does not read from a file and post it

 Is there any other way or is this use case not supported? I tried this:

 $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin
 $ curl host/solr/update -F stream.body=' commit /'

 Solr did read the file, because solr complained when the file wasn't
 in the format the JavaBinUpdateRequestCodec expected. But no data is
 added to the index for some reason.

 how did you create the file /tmp/data.bin ? what is the format?

 I wrote this in the first email. It's in the javabin format (I think).
 I did like this (groovy code):

   fieldId = new NamedList()
   fieldId.add(name, id)
   fieldId.add(val, 9-0)
   fieldId.add(boost, null)
   fieldText = new NamedList()
   fieldText.add(name, text)
   fieldText.add(val, Some text)
   fieldText.add(boost, null)
   fieldNull = new NamedList()
   fieldNull.add(boost, null)
   doc = [fieldNull, fieldId, fieldText]
   docs = [doc]
   root = new NamedList()
   root.add(docs, docs)
   fos = new FileOutputStream(data.bin)
   new JavaBinCodec().marshal(root, fos)

 /Tim

JavaBin is a format.
Use the method JavaBinUpdateRequestCodec#marshal(UpdateRequest
updateRequest, OutputStream os).

The output of this can be posted to Solr and it should work.
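
For illustration, a minimal sketch of that approach (the class name, field
names and output path are assumptions; it only shows the marshal call
described above):

  import java.io.FileOutputStream;
  import org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec;
  import org.apache.solr.client.solrj.request.UpdateRequest;
  import org.apache.solr.common.SolrInputDocument;

  public class WriteJavaBin {
    public static void main(String[] args) throws Exception {
      UpdateRequest req = new UpdateRequest();
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "9-0");
      doc.addField("text", "Some text");
      req.add(doc);
      // serialize the whole UpdateRequest, not just a NamedList of docs
      FileOutputStream fos = new FileOutputStream("/tmp/data.bin");
      new JavaBinUpdateRequestCodec().marshal(req, fos);
      fos.close();
    }
  }

The resulting file can then be posted with the curl commands quoted above;
when writing directly from Java it is usually simpler to set a
BinaryRequestWriter on CommonsHttpSolrServer, as in the first snippet of this
thread.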



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Replication Handler Severe Error: Unable to move index file

2010-01-21 Thread Noble Paul നോബിള്‍ नोब्ळ्
Is it a one-off case? Do you observe this frequently?

On Thu, Jan 21, 2010 at 11:26 AM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
 It's hard to tell without poking around, but one of the first things I'd do 
 would be to look for /home/solr/cores/core8/index.20100119103919/_6qv.fnm - 
 does this file/dir really exist?  Or, rather, did it exist when the error 
 happened.

 I'm not looking at the source code now, but is that really the only error you 
 got?  No exception stack trace?

  Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



 - Original Message 
 From: Trey solrt...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Wed, January 20, 2010 11:54:43 PM
 Subject: Replication Handler Severe Error: Unable to move index file

 Does anyone know what would cause the following error?:

 10:45:10 AM org.apache.solr.handler.SnapPuller copyAFile

      SEVERE: *Unable to move index file* from:
 /home/solr/cores/core8/index.20100119103919/_6qv.fnm to:
 /home/solr/cores/core8/index/_6qv.fnm
 This occurred a few days back and we noticed that several full copies of the
 index were subsequently pulled from the master to the slave, effectively
 evicting our live index from RAM (the linux os cache), and killing our query
 performance due to disk io contention.

 Has anyone experienced this behavior recently?  I found an old thread about
 this error from early 2009, but it looks like it was patched almost a year
 ago:
 http://old.nabble.com/%22Unable-to-move-index-file%22-error-during-replication-td21157722.html


 Additional Relevant information:
 -We are using the Solr 1.4 official release + a field collapsing patch from
 mid December (which I believe should only affect query side, not indexing /
 replication).
 -Our Replication PollInterval for slaves checking the master is very small
 (15 seconds)
 -We have a multi-box distributed search with each box possessing multiple
 cores
 -We issue a manual (rolling) optimize across the cores on the master once a
 day (occurred ~ 1-2 hours before the above timeline)
 -maxWarmingSearchers is set to 1.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Replication Handler Severe Error: Unable to move index file

2010-01-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Jan 22, 2010 at 4:24 AM, Trey solrt...@gmail.com wrote:
 Unfortunately, when I went back to look at the logs this morning, the log
 file had been blown away... that puts a major damper on my debugging
 capabilities - so sorry about that.  As a double whammy, we optimize
 nightly, so the old index files have completely changed at this point.

 I do not remember seeing an exception / stack trace in the logs associated
 with the SEVERE *Unable to move file* entry, but we were grepping the
 logs, so if it was outputted onto another line it could have possibly been
 there.  I wouldn't really expect to see anything based upon the code in
 SnapPuller.java:

 /**
   * Copy a file by the File#renameTo() method. If it fails, it is
 considered a failure
   * p/
   * Todo may be we should try a simple copy if it fails
   */
  private boolean copyAFile(File tmpIdxDir, File indexDir, String fname,
 ListString copiedfiles) {
    File indexFileInTmpDir = new File(tmpIdxDir, fname);
    File indexFileInIndex = new File(indexDir, fname);
    boolean success = indexFileInTmpDir.renameTo(indexFileInIndex);
    if (!success) {
      LOG.error(Unable to move index file from:  + indexFileInTmpDir
              +  to:  + indexFileInIndex);
      for (String f : copiedfiles) {
        File indexFile = new File(indexDir, f);
        if (indexFile.exists())
          indexFile.delete();
      }
      delTree(tmpIdxDir);
      return false;
    }
    return true;
  }

 In terms of whether this is an off case: this is the first occurrence of
 this I have seen in the logs.  We tried to replicate the conditions under
 which the exception occurred, but were unable.  I'll send along some more
 useful info if this happens again.

 In terms of the behavior we saw: It appears that a replication occurred and
 the Unable to move file error occurred.  As a result, it looks like the
 ENTIRE index was subsequently replicated again into a temporary directory
 (several times, over and over).

 The end result was that we had multiple full copies of the index in
 temporary index folders on the slave, and the original still couldn't be
 updated (the move to ./index wouldn't work).  Does Solr ever hold files open
 in a manner that would prevent a file in the index directory from being
 overridden?

There is a TODO which says we may try a manual copy if the move (renameTo)
fails. We never did it because we never observed renameTo failing.


 2010/1/21 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com

 is it a one off case? do you observerve this frequently?

 On Thu, Jan 21, 2010 at 11:26 AM, Otis Gospodnetic
 otis_gospodne...@yahoo.com wrote:
  It's hard to tell without poking around, but one of the first things I'd
 do would be to look for /home/solr/cores/core8/index.20100119103919/_6qv.fnm
 - does this file/dir really exist?  Or, rather, did it exist when the error
 happened.
 
  I'm not looking at the source code now, but is that really the only error
 you got?  No exception stack trace?
 
   Otis
  --
  Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
 
 
 
  - Original Message 
  From: Trey solrt...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Wed, January 20, 2010 11:54:43 PM
  Subject: Replication Handler Severe Error: Unable to move index file
 
  Does anyone know what would cause the following error?:
 
  10:45:10 AM org.apache.solr.handler.SnapPuller copyAFile
 
       SEVERE: *Unable to move index file* from:
  /home/solr/cores/core8/index.20100119103919/_6qv.fnm to:
  /home/solr/cores/core8/index/_6qv.fnm
  This occurred a few days back and we noticed that several full copies of
 the
  index were subsequently pulled from the master to the slave, effectively
  evicting our live index from RAM (the linux os cache), and killing our
 query
  performance due to disk io contention.
 
  Has anyone experienced this behavior recently?  I found an old thread
 about
  this error from early 2009, but it looks like it was patched almost a
 year
  ago:
 
 http://old.nabble.com/%22Unable-to-move-index-file%22-error-during-replication-td21157722.html
 
 
  Additional Relevant information:
  -We are using the Solr 1.4 official release + a field collapsing patch
 from
  mid December (which I believe should only affect query side, not
 indexing /
  replication).
  -Our Replication PollInterval for slaves checking the master is very
 small
  (15 seconds)
  -We have a multi-box distributed search with each box possessing
 multiple
  cores
  -We issue a manual (rolling) optimize across the cores on the master
 once a
  day (occurred ~ 1-2 hours before the above timeline)
  -maxWarmingSearchers is set to 1.
 
 



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is no corresponding DataSource which can be used with
TikaEntityProcessor to read from a BLOB.
I have opened an issue: https://issues.apache.org/jira/browse/SOLR-1737

On Mon, Jan 25, 2010 at 10:57 PM, Shah, Nirmal ns...@columnit.com wrote:
 Hi,



 I am fairly new to Solr and would like to use the DIH to pull rich text
 files (pdfs, etc) from BLOB fields in my database.



 There was a suggestion made to use the FieldReaderDataSource with the
 recently commited TikaEntityProcessor.  Has anyone accomplished this?

 This is my configuration, and the resulting error - I'm not sure if I'm
 using the FieldReaderDataSource correctly.  If anyone could shed light
 on whether I am going the right direction or not, it would be
 appreciated.



 ---Data-config.xml:

 dataConfig

   datasource name=f1 type=FieldReaderDataSource /

   dataSource name=orcle driver=oracle.jdbc.driver.OracleDriver
 url=jdbc:oracle:thin:un/p...@host:1521:sid /

      document

      entity dataSource=orcle name=attach query=select id as name,
 attachment from testtable2

         entity dataSource=f1 processor=TikaEntityProcessor
 dataField=attach.attachment format=text

            field column=text name=NAME /

         /entity

      /entity

   /document

 /dataConfig





 -Debug error:

 response

 lst name=responseHeader

 int name=status0/int

 int name=QTime203/int

 /lst

 lst name=initArgs

 lst name=defaults

 str name=configtestdb-data-config.xml/str

 /lst

 /lst

 str name=commandfull-import/str

 str name=modedebug/str

 null name=documents/

 lst name=verbose-output

 lst name=entity:attach

 lst name=document#1

 str name=queryselect id as name, attachment from testtable2/str

 str name=time-taken0:0:0.32/str

 str--- row #1-/str

 str name=NAMEjava.math.BigDecimal:2/str

 str name=ATTACHMENToracle.sql.BLOB:oracle.sql.b...@1c8e807/str

 str-/str

 lst name=entity:253433571801723

 str name=EXCEPTION

 org.apache.solr.handler.dataimport.DataImportHandlerException: No
 dataSource :f1 available for entity :253433571801723 Processing Document
 # 1

                at
 org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(Da
 taImporter.java:279)

                at
 org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl
 .java:93)

                at
 org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntit
 yProcessor.java:97)

                at
 org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(Entity
 ProcessorWrapper.java:237)

                at
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.j
 ava:357)

                at
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.j
 ava:383)

                at
 org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java
 :242)

                at
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:18
 0)

                at
 org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporte
 r.java:331)

                at
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java
 :389)

                at
 org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(D
 ataImportHandler.java:203)

                at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerB
 ase.java:131)

                at
 org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)

                at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.ja
 va:338)

                at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.j
 ava:241)

                at
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHan
 dler.java:1089)

                at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)

                at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:2
 16)

                at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)

                at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)

                at
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)

                at
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandler
 Collection.java:211)

                at
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.jav
 a:114)

                at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)

                at org.mortbay.jetty.Server.handle(Server.java:285)

                at
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)

                at
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConne
 ction.java:821)

                at
 org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)

 

Re: Fastest way to use solrj

2010-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you write only a few docs you may not observe much difference in
size. If you write a large number of docs you may observe a big difference.

2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I got the binary format to work perfectly now. Performance is better
 than with xml. Thanks!

 Although, it doesn't look like a binary file is smaller in size than
 an xml file?

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/21 Tim Terlegård tim.terleg...@gmail.com:
 Yes, it worked! Thank you very much. But do I need to use curl or can
 I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't
 use BinaryWriter then I don't know how to do this.
 if your data is serialized using JavaBinUpdateRequestCodec, you may
 POST it using curl.
 If you are writing directly , use CommonsHttpSolrServer

 /Tim

 2010/1/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/20 Tim Terlegård tim.terleg...@gmail.com:
 BinaryRequestWriter does not read from a file and post it

 Is there any other way or is this use case not supported? I tried this:

 $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin
 $ curl host/solr/update -F stream.body=' commit /'

 Solr did read the file, because solr complained when the file wasn't
 in the format the JavaBinUpdateRequestCodec expected. But no data is
 added to the index for some reason.

 how did you create the file /tmp/data.bin ? what is the format?

 I wrote this in the first email. It's in the javabin format (I think).
 I did like this (groovy code):

   fieldId = new NamedList()
   fieldId.add(name, id)
   fieldId.add(val, 9-0)
   fieldId.add(boost, null)
   fieldText = new NamedList()
   fieldText.add(name, text)
   fieldText.add(val, Some text)
   fieldText.add(boost, null)
   fieldNull = new NamedList()
   fieldNull.add(boost, null)
   doc = [fieldNull, fieldId, fieldText]
   docs = [doc]
   root = new NamedList()
   root.add(docs, docs)
   fos = new FileOutputStream(data.bin)
   new JavaBinCodec().marshal(root, fos)

 /Tim

 JavaBin is a format.
 use this method JavaBinUpdateRequestCodec# marshal(UpdateRequest
 updateRequest, OutputStream os)

 The output of this can be posted to solr and it should work



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Fastest way to use solrj

2010-01-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
How many fields are there in each doc? The binary format just reduces
overhead; it does not touch/compress the payload.

2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I have 3 millon documents, each having 5000 chars. The xml file is
 about 15GB. The binary file is also about 15GB.

 I was a bit surprised about this. It doesn't bother me much though. At
 least it performs better.

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 if you write only a few docs you may not observe much difference in
 size. if you write large no:of docs you may observe a big difference.

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I got the binary format to work perfectly now. Performance is better
 than with xml. Thanks!

 Although, it doesn't look like a binary file is smaller in size than
 an xml file?

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/21 Tim Terlegård tim.terleg...@gmail.com:
 Yes, it worked! Thank you very much. But do I need to use curl or can
 I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't
 use BinaryWriter then I don't know how to do this.
 if your data is serialized using JavaBinUpdateRequestCodec, you may
 POST it using curl.
 If you are writing directly , use CommonsHttpSolrServer

 /Tim

 2010/1/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/20 Tim Terlegård tim.terleg...@gmail.com:
 BinaryRequestWriter does not read from a file and post it

 Is there any other way or is this use case not supported? I tried 
 this:

 $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin
 $ curl host/solr/update -F stream.body=' commit /'

 Solr did read the file, because solr complained when the file wasn't
 in the format the JavaBinUpdateRequestCodec expected. But no data is
 added to the index for some reason.

 how did you create the file /tmp/data.bin ? what is the format?

 I wrote this in the first email. It's in the javabin format (I think).
 I did like this (groovy code):

   fieldId = new NamedList()
   fieldId.add(name, id)
   fieldId.add(val, 9-0)
   fieldId.add(boost, null)
   fieldText = new NamedList()
   fieldText.add(name, text)
   fieldText.add(val, Some text)
   fieldText.add(boost, null)
   fieldNull = new NamedList()
   fieldNull.add(boost, null)
   doc = [fieldNull, fieldId, fieldText]
   docs = [doc]
   root = new NamedList()
   root.add(docs, docs)
   fos = new FileOutputStream(data.bin)
   new JavaBinCodec().marshal(root, fos)

 /Tim

 JavaBin is a format.
 use this method JavaBinUpdateRequestCodec# marshal(UpdateRequest
 updateRequest, OutputStream os)

 The output of this can be posted to solr and it should work



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Fastest way to use solrj

2010-01-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
The binary format just reduces overhead. In your case, all the data
is in the big text field, which is not compressed. But overall, the
parsing is a lot faster for the binary format, so you see a perf boost.

2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I have 6 fields. The text field is the biggest, it contains almost all
 of the 5000 chars.

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 how many fields are there in each doc? the binary format just reduces
 overhead. it does not touch/compress the payload

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I have 3 millon documents, each having 5000 chars. The xml file is
 about 15GB. The binary file is also about 15GB.

 I was a bit surprised about this. It doesn't bother me much though. At
 least it performs better.

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 if you write only a few docs you may not observe much difference in
 size. if you write large no:of docs you may observe a big difference.

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I got the binary format to work perfectly now. Performance is better
 than with xml. Thanks!

 Although, it doesn't look like a binary file is smaller in size than
 an xml file?

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/21 Tim Terlegård tim.terleg...@gmail.com:
 Yes, it worked! Thank you very much. But do I need to use curl or can
 I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't
 use BinaryWriter then I don't know how to do this.
 if your data is serialized using JavaBinUpdateRequestCodec, you may
 POST it using curl.
 If you are writing directly , use CommonsHttpSolrServer

 /Tim

 2010/1/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/20 Tim Terlegård tim.terleg...@gmail.com:
 BinaryRequestWriter does not read from a file and post it

 Is there any other way or is this use case not supported? I tried 
 this:

 $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin
 $ curl host/solr/update -F stream.body=' commit /'

 Solr did read the file, because solr complained when the file wasn't
 in the format the JavaBinUpdateRequestCodec expected. But no data is
 added to the index for some reason.

 how did you create the file /tmp/data.bin ? what is the format?

 I wrote this in the first email. It's in the javabin format (I think).
 I did like this (groovy code):

   fieldId = new NamedList()
   fieldId.add(name, id)
   fieldId.add(val, 9-0)
   fieldId.add(boost, null)
   fieldText = new NamedList()
   fieldText.add(name, text)
   fieldText.add(val, Some text)
   fieldText.add(boost, null)
   fieldNull = new NamedList()
   fieldNull.add(boost, null)
   doc = [fieldNull, fieldId, fieldText]
   docs = [doc]
   root = new NamedList()
   root.add(docs, docs)
   fos = new FileOutputStream(data.bin)
   new JavaBinCodec().marshal(root, fos)

 /Tim

 JavaBin is a format.
 use this method JavaBinUpdateRequestCodec# marshal(UpdateRequest
 updateRequest, OutputStream os)

 The output of this can be posted to solr and it should work



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Help using CachedSqlEntityProcessor

2010-01-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
cacheKey and cacheLookup are required attributes.

On Thu, Jan 28, 2010 at 12:51 AM, KirstyS kirst...@gmail.com wrote:

 Thanks. I am on 1.4..so maybe that is the problem.
 Will try when I get back to work tomorrow.
 Thanks


 Rolf Johansson-2 wrote:

 I recently had issues with CachedSqlEntityProcessor too, figuring out how
 to
 use the syntax. After a while, I managed to get it working with cacheKey
 and
 cacheLookup. I think this is 1.4 specific though.

 It seems you have double WHERE clauses, one in the query and one in the
 where attribute.

 Try using cacheKey and cacheLookup instead in something like this:

 entity name=LinkedCategory pk=LinkedCatArticleId
         query=SELECT LinkedCategoryBC, CmsArticleId as LinkedCatAricleId
                FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
         processor=CachedSqlEntityProcessor
         cacheKey=LINKEDCATARTICLEID
         cacheLookup=article.CMSARTICLEID
         deltaQuery=SELECT LinkedCategoryBC
                     FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
                     WHERE convert(varchar(50), LastUpdateDate) 
                     '${dataimporter.article.last_index_time}'
                     OR convert(varchar(50), PublishDate) 
                     '${dataimporter.article.last_index_time}'
         parentDeltaQuery=SELECT * from vArticleSummaryDetail_SolrSearch
                          (nolock)
     field column=LinkedCategoryBC name=LinkedCategoryBreadCrumb/
 /entity

 /Rolf


 Den 2010-01-27 12.36, skrev KirstyS kirst...@gmail.com:


 Hi, I have looked on the wiki. Using the CachedSqlEntityProcessor looks
 like
 it was simple. But I am getting no speed benefit and am not sure if I
 have
 even got the syntax correct.
 I have a main root entity called 'article'.

 And then I have a number of sub entities. One such entity is as such :

     entity name=LinkedCategory pk=LinkedCatAricleId
               query=SELECT LinkedCategoryBC, CmsArticleId as
 LinkedCatAricleId
                      FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
                      WHERE convert(varchar(50), CmsArticleId) =
 convert(varchar(50), '${article.CmsArticleId}') 
                 processor=CachedSqlEntityProcessor
                 WHERE=LinkedCatArticleId = article.CmsArticleId
                 deltaQuery=SELECT LinkedCategoryBC
                             FROM LinkedCategoryBreadCrumb_SolrSearch
 (nolock)
                             WHERE convert(varchar(50), CmsArticleId) =
 convert(varchar(50), '${article.CmsArticleId}')
                             AND (convert(varchar(50), LastUpdateDate) 
 '${dataimporter.article.last_index_time}'
                             OR   convert(varchar(50), PublishDate) 
 '${dataimporter.article.last_index_time}')
                 parentDeltaQuery=SELECT * from
 vArticleSummaryDetail_SolrSearch (nolock)
                                  WHERE convert(varchar(50), CmsArticleId)
 =
 convert(varchar(50), '${article.CmsArticleId}')
         field column=LinkedCategoryBC
 name=LinkedCategoryBreadCrumb/
       /entity


 As you can see I have added (for the main query - not worrying about the
 delta queries yet!!) the processor and the 'where' but not sure if it's
 correct.
 Can anyone point me in the right direction???
 Thanks
 Kirsty




 --
 View this message in context: 
 http://old.nabble.com/Help-using-CachedSqlEntityProcessor-tp27337635p27345412.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Help using CachedSqlEntityProcessor

2010-01-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
Thanks for pointing this out. The wiki had a problem for a while and
we could not update the documentation. It is updated here:
http://wiki.apache.org/solr/DataImportHandler#cached

On Thu, Jan 28, 2010 at 6:31 PM, KirstyS kirst...@gmail.com wrote:

 Thanks,
 I saw that mistake and I have it working now!!! thank you for all your help.
 Out of interest, is the cacheKey and cacheLookup documented anywhere?



 Rolf Johansson-2 wrote:

 It's always a good thing if you can check the debug log (fx catalina.out)
 or
 run with debug/verbose to check how Solr runs trough the dataconfig.

 You've also made a typo in the pk and query, LinkedCatAricleId is
 missing
 a t.

 /Rolf

 Den 2010-01-28 11.20, skrev KirstyS kirst...@gmail.com:


 Okay, I changed my entity to look like this (have included my main entity
 as
 well):
  document name=ArticleDocument
     entity name=article pk=CmsArticleId
             query=Select * from vArticleSummaryDetail_SolrSearch
 (nolock)
 WHERE ArticleStatusId = 1

       entity name=LinkedCategory pk=LinkedCatAricleId
               query=SELECT LinkedCategoryBC, CmsArticleId as
 LinkedCatAricleId
                      FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
                      processor=CachedSqlEntityProcessor
                      cacheKey=LinkedCatArticleId
                      cacheLookup=article.CmsArticleId
       /entity
 /entity
 /document

 BUT now the index is taking SO much longer Have I missed
 any
 other configurationg changes? Do I need to add anything into the
 solfconfig.xml file?  Do I have my syntax completely wrong?

 Any help is greatly appreciated!!!




 --
 View this message in context: 
 http://old.nabble.com/Help-using-CachedSqlEntityProcessor-tp27337635p27355501.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Solr 1.4 Replication index directories

2010-01-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
The index.20100127044500/ directory is a temp directory; it should have been
cleaned up if there was no problem in replication (see the logs if there was
a problem). If there was a problem, the temp directory is used as the new
index directory and the old one is no longer used. At any given point only
one directory is used for the index; check the replication dashboard to see
which one it is. Everything else can be deleted.
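
One way to confirm which directory is live is the index.properties file
mentioned below (the path shown here is an assumption):

  cat <data dir>/index.properties
  # e.g.  index=index.20100127044500   -> that directory is the live index,
  # and the plain index/ directory is no longer in use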

On Fri, Jan 29, 2010 at 6:03 AM, mark angelillo li...@snooth.com wrote:
 Thanks, Otis. Responses inline.


 Hi,

 We're using the new replication and it's working pretty well. There's one
 detail
 I'd like to get some more information about.

 As the replication works, it creates versions of the index in the data
 directory. Originally we had index/, but now there are dated versions
 such as
 index.20100127044500/, which are the replicated versions.

 Each copy is sized in the vicinity of 65G. With our current hard drive
 it's fine
 to have two around, but 3 gets a little dicey. Sometimes we're finding
 that the
 replication doesn't always clean up after itself. I would like to
 understand
 this better, or to not have this happen. It could be a configuration
 issue.

 Some more specific questions:

 - Is it safe to remove the index/ directory (that doesn't have the date
 on it)?
 I think I tried this once and the whole thing broke, however maybe
 something
 else was wrong at the time.

 No, that's the real, live index, you don't want to remove that one.


 Yeah... I tried it once and remember things breaking.

 However nothing in this directory has been modified for over a week (since
 the last replication initialization). And I'm still sitting on 130GB of data
 for what is only 65GB on the master




 - Is there a way to know which one is the current one? (I'm looking at
 the file
 index.properties, and it seems to be correct, but sometimes there's a
 newer
 version in the directory, which later is removed)

 I think the index one is always current, no?  If not, I imagine the
 admin replication page will tell you, or even the Statistics page.
 e.g.
 reader :
  SolrIndexReader{this=46a55e,r=readonlysegmentrea...@46a55e,segments=1}
 readerDir :
  org.apache.lucene.store.NIOFSDirectory@/mnt/solrhome/cores/foo/data/index


 reader :
 SolrIndexReader{this=5c3aef1,r=readonlydirectoryrea...@5c3aef1,refCnt=1,segments=9}
 readerDir :
 org.apache.lucene.store.NIOFSDirectory@/home/solr/solr_1.4/solr/data/index.20100127044500





 - Could it be that the index does not finish replicating in the poll
 interval I
 give it? What happens if, say there's a poll interval X and replicating
 the
 index happens to take longer than X sometimes. (Our current poll interval
 is 45
 minutes, and every time I'm watching it it completes in time.)

You can keep a very small pollInterval and it is OK. If a replication
is going on, no new replication will be initiated till the old one
completes.


 I think only 1 replication will/should be happening at a time.

 Whew, that's comforting.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: loading an updateProcessorChain with multicore in trunk

2010-01-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess default=true should not be necessary if there is only one
updateRequestProcessorChain specified. Please open an issue.
On Fri, Jan 29, 2010 at 6:06 PM, Marc Sturlese marc.sturl...@gmail.com wrote:

 I am testing trunk and have seen a different behaviour when loading
 updateProcessors wich I don't know if it's normal (at least with multicore)
 Before I use to use an updateProcessorChain this way:

 requestHandler name=/update class=solr.XmlUpdateRequestHandler
    lst name=defaults
       str name=update.processormyChain/str
    /lst
 /requestHandler
 updateRequestProcessorChain name=myChain
    processor
 class=org.apache.solr.update.processor.CustomUpdateProcessorFactory /
    processor
 class=org.apache.solr.update.processor.LogUpdateProcessorFactory /
    processor
 class=org.apache.solr.update.processor.RunUpdateProcessorFactory /
 /updateRequestProcessorChain

 It does not work in current trunk. I have debuged the code and I have seen
 now UpdateProcessorChain is loaded via:

  public T T initPlugins(ListPluginInfo pluginInfos, MapString, T
 registry, ClassT type, String defClassName) {
    T def = null;
    for (PluginInfo info : pluginInfos) {
      T o = createInitInstance(info,type, type.getSimpleName(),
 defClassName);
      registry.put(info.name, o);
      if(info.isDefault()){
            def = o;
      }
    }
    return def;
  }

 As I don't have default=true in the configuration, my custom
 processorChain is not used. Setting default=true makes it work:

 requestHandler name=/update class=solr.XmlUpdateRequestHandler
    lst name=defaults
       str name=update.processormyChain/str
    /lst
 /requestHandler
 updateRequestProcessorChain name=myChain default=true
    processor
 class=org.apache.solr.update.processor.CustomUpdateProcessorFactory /
    processor
 class=org.apache.solr.update.processor.LogUpdateProcessorFactory /
    processor
 class=org.apache.solr.update.processor.RunUpdateProcessorFactory /
 /updateRequestProcessorChain

 As far as I understand, if you specify the chain you want to use in here:
 requestHandler name=/update class=solr.XmlUpdateRequestHandler
    lst name=defaults
       str name=update.processormyChain/str
    /lst
 /requestHandler

 Shouldn't be necesary to set it as default.
 Is it going to be kept this way?

 Thanks in advance



 --
 View this message in context: 
 http://old.nabble.com/loading-an-updateProcessorChain-with-multicore-in-trunk-tp27371375p27371375.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler problem - reading XML from a file

2010-01-31 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is clear that the xpaths provided won't fetch anything, because there
is no data in those paths. What do you really wish to be indexed?


On Sun, Jan 31, 2010 at 10:30 AM, Lance Norskog goks...@gmail.com wrote:
 This DataImportHandler script does not find any documents in this HTML
 file. The DIH definitely opens the file, but the either the
 xpathprocessor gets no data or it does not recognize the xpaths
 described. Any hints? (I'm using Solr 1.5-dev, sometime recent.)

 Thanks!

 Lance


 xhtml-data-config.xml:

 dataConfig
        dataSource type=FileDataSource encoding=UTF-8 /
        document
        entity name=xhtml
                        forEach=/html/head | /html/body
                        processor=XPathEntityProcessor pk=id
                        transformer=TemplateTransformer
                        url=/cygwin/tmp/ch05-tokenizers-filters-Solr1.4.html
                        
                field column=head_s xpath=/html/head/
                field column=body_s xpath=/html/body/
        /entity
        /document
 /dataConfig

 Sample data file: cygwin/tmp/ch05-tokenizers-filters-Solr1.4.html

 ?xml version=1.0 encoding=UTF-8 ?
 html 
  head 
    meta content=en-US name=DC.language /
  /head
  body
    div id=header
     a href=ch05-tokenizers-filters-Solr1.4.htmlFirst/a
        span class=nolinkPrevious/span
        a href=ch05-tokenizers-filters-Solr1.41.htmlNext/a
        a href=ch05-tokenizers-filters-Solr1.460.htmlLast/a
    /div
    div dir=ltr id=content style=background-color:transparent
      h1 id=toc0
        span class=SectionNumber1/span
        a id=RefHeading36402771/a
        a id=bkmRefHeading36402771/a
        Understanding Analyzers, Tokenizers, and Filters
      /h1
    /div
  /body
 /html



 --
 Lance Norskog
 goks...@gmail.com




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: replication setup

2010-01-31 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is always recommended to paste your actual configuration and
startup commands, instead of just saying that you followed the wiki.
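
In a multicore setup like the one described below, note also that the
ReplicationHandler has to be declared in each core's solrconfig.xml and that
its URL includes the core name, which is a common cause of a 404 on
/replication. For example (host, port and core name are assumptions):

  http://localhost:8983/solr/core0/replication?command=details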

On Tue, Jan 26, 2010 at 9:52 PM, Matthieu Labour
matthieu_lab...@yahoo.com wrote:
 Hi



 I have set up replication following the wiki

 I downloaded the latest apache-solr-1.4 release and exploded it in 2 
 different directories
 I modified both solrconfig.xml for the master  the slave as described on the 
 wiki page
 In both sirectory, I started solr from the example directory
 example on the master:
 java -Dsolr.solr.home=multicore -Djetty.host=0.0.0.0 -Djetty.port=8983 
 -DSTOP.PORT=8078 -DSTOP.KEY=stop.now -jar start.jar

 and on the slave
 java -Dsolr.solr.home=multicore -Djetty.host=0.0.0.0 -Djetty.port=8982 
 -DSTOP.PORT=8077 -DSTOP.KEY=stop.now -jar start.jar



 I can see core0 and core 1 when I open the solr url
 However, I don't see a replication link and
 the following url  solr url / replication returns a 404 error



 I must be doing something wrong. I would appreciate any help !



 thanks a lot

 matt








-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler delta-import confusion

2010-02-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
try
deltaImportQuery=select [bunch of stuff]
   WHERE m.moment_id = '${dataimporter.delta.moment_id}'

The key has to be the same and in the same case.
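
In other words, the column returned by deltaQuery and the placeholder used in
deltaImportQuery must match exactly, including case; abridged from the config
below:

  deltaQuery="select moment_id from moments
              where date_modified > '${dataimporter.last_index_time}'"
  deltaImportQuery="select [bunch of stuff]
              where m.moment_id = '${dataimporter.delta.moment_id}'"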

On Tue, Feb 2, 2010 at 1:45 AM, Jon Drukman jdruk...@gmail.com wrote:
 First, let me just say that DataImportHandler is fantastic. It got my old
 mysql-php-xml index rebuild process down from 30 hours to 6 minutes.

 I'm trying to use the delta-import functionality now but failing miserably.

 Here's my entity tag:  (some SELECT statements reduced to increase
 readability)

 entity name=moment
  query=select ...

  deltaQuery=select moment_id from moments where date_modified 
 '${dataimporter.last_index_time}'

  deltaImportQuery=select [bunch of stuff]
    WHERE m.moment_id = '${dataimporter.delta.MOMENTID}'

  pk=MOMENTID

  transformer=TemplateTransformer

 When I look at the MySQL query log I see the date modified query running
 fine and returning 3 rows.  The deltaImportQuery, however, does not have the
 proper primary key in the where clause.  It's just blank.  I also tried
 changing it to ${moment.MOMENTID}.

 I don't really get the relation between the pk field and the
 ${dataimport.delta.whatever} stuff.

 Help please!
 -jsd-






-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler delta-import confusion

2010-02-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
Please do not hijack a thread. http://people.apache.org/~hossman/#threadhijack

On Tue, Feb 2, 2010 at 11:32 PM, Leann Pereira
le...@1sourcestaffing.com wrote:
 Hi Paul,

 Can you take me off this distribution list?

 Thanks,

 Leann

 
 From: noble.p...@gmail.com [noble.p...@gmail.com] On Behalf Of Noble Paul 
 നോബിള്‍  नोब्ळ् [noble.p...@corp.aol.com]
 Sent: Tuesday, February 02, 2010 2:12 AM
 To: solr-user@lucene.apache.org
 Subject: Re: DataImportHandler delta-import confusion

 try
 deltaImportQuery=select [bunch of stuff]
   WHERE m.moment_id = '${dataimporter.delta.moment_id}'

 The key has to be same and in the same case

 On Tue, Feb 2, 2010 at 1:45 AM, Jon Drukman jdruk...@gmail.com wrote:
 First, let me just say that DataImportHandler is fantastic. It got my old
 mysql-php-xml index rebuild process down from 30 hours to 6 minutes.

 I'm trying to use the delta-import functionality now but failing miserably.

 Here's my entity tag:  (some SELECT statements reduced to increase
 readability)

 entity name=moment
  query=select ...

  deltaQuery=select moment_id from moments where date_modified 
 '${dataimporter.last_index_time}'

  deltaImportQuery=select [bunch of stuff]
    WHERE m.moment_id = '${dataimporter.delta.MOMENTID}'

  pk=MOMENTID

  transformer=TemplateTransformer

 When I look at the MySQL query log I see the date modified query running
 fine and returning 3 rows.  The deltaImportQuery, however, does not have the
 proper primary key in the where clause.  It's just blank.  I also tried
 changing it to ${moment.MOMENTID}.

 I don't really get the relation between the pk field and the
 ${dataimport.delta.whatever} stuff.

 Help please!
 -jsd-






 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler - convertType attribute

2010-02-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
Implicit conversion can cause problems when Transformers are applied.
It is hard for the user to guess the type of the field by looking at the
schema.xml. In Solr, String is the most commonly used type. If you
wish to do numeric operations on a field, convertType will cause
problems.
If it is explicitly set, the user knows why the type got changed.
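
If it is wanted, it is switched on per data source, for example (the driver,
url and credentials below are placeholders):

  <dataSource driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@host:1521:sid"
              user="..." password="..."
              convertType="true"/>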

On Tue, Feb 2, 2010 at 6:38 PM, Alexey Serba ase...@gmail.com wrote:
 Hello,

 I encountered blob indexing problem and found convertType solution in
 FAQhttp://wiki.apache.org/solr/DataImportHandlerFaq#Blob_values_in_my_table_are_added_to_the_Solr_document_as_object_strings_like_B.401f23c5

 I was wondering why it is not enabled by default and found the
 following comment
 http://www.lucidimagination.com/search/document/169e6cc87dad5e67/dataimporthandler_and_blobs#169e6cc87dad5e67in
 mailing list:

 We used to attempt type conversion from the SQL type to the field's given
 type. We
 found that it was error prone and switched to using the ResultSet#getObject
 for all columns (making the old behavior a configurable option –
 convertType in JdbcDataSource).

 Why it is error prone? Is it safe enough to enable convertType for all jdbc
 data sources by default? What are the side effects?

 Thanks in advance,
 Alex




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler - convertType attribute

2010-02-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Feb 3, 2010 at 3:31 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
 One thing I find awkward about convertType is that it is JdbcDataSource
 specific, rather than field-specific.  Isn't the current implementation far
 too broad?
It is a feature of JdbcDataSource, and no other dataSource offers it. We
offer it because JDBC drivers have a mechanism to do type conversion.

What do you mean by "it is too broad"?


        Erik

 On Feb 3, 2010, at 1:16 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:

 implicit conversion can cause problem when Transformers are applied.
 It is hard for user to guess the type of the field by looking at the
 schema.xml. In Solr, String is the most commonly used type. if you
 wish to do numeric operations on a field convertType will cause
 problems.
 If it is explicitly set, user knows why the type got changed.

 On Tue, Feb 2, 2010 at 6:38 PM, Alexey Serba ase...@gmail.com wrote:

 Hello,

 I encountered blob indexing problem and found convertType solution in

 FAQhttp://wiki.apache.org/solr/DataImportHandlerFaq#Blob_values_in_my_table_are_added_to_the_Solr_document_as_object_strings_like_B.401f23c5

 I was wondering why it is not enabled by default and found the
 following comment

 http://www.lucidimagination.com/search/document/169e6cc87dad5e67/dataimporthandler_and_blobs#169e6cc87dad5e67in
 mailing list:

 We used to attempt type conversion from the SQL type to the field's
 given
 type. We
 found that it was error prone and switched to using the
 ResultSet#getObject
 for all columns (making the old behavior a configurable option –
 convertType in JdbcDataSource).

 Why it is error prone? Is it safe enough to enable convertType for all
 jdbc
 data sources by default? What are the side effects?

 Thanks in advance,
 Alex




 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DataImportHandler - convertType attribute

2010-02-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Feb 3, 2010 at 4:16 PM, Erik Hatcher erik.hatc...@gmail.com wrote:

 On Feb 3, 2010, at 5:36 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:

 On Wed, Feb 3, 2010 at 3:31 PM, Erik Hatcher erik.hatc...@gmail.com
 wrote:

 One thing I find awkward about convertType is that it is JdbcDataSource
 specific, rather than field-specific.  Isn't the current implementation
 far
 too broad?

 it is feature of JdbcdataSource and no other dataSource offers it. we
 offer it because JDBC drivers have mechanism to do type conversion

 What do you mean by it is too broad?

 I mean the convertType flag is not field-specific (or at least field
 overridable).  Conversions occur on a per-field basis, but the setting is
 for the entire data source and thus all fields.
Yes, it is true.
First of all, this is not very widely used, so fine-tuning did not make sense.

        Erik





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: java.lang.NullPointerException with MySQL DataImportHandler

2010-02-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Feb 4, 2010 at 10:50 AM, Lance Norskog goks...@gmail.com wrote:
 I just tested this with a DIH that does not use database input.

 If the DataImportHandler JDBC code does not support a schema that has
 optional fields, that is a major weakness. Noble/Shalin, is this true?
The problem is obviously not with DIH. DIH blindly passes on all the
fields it could obtain from the DB; if some field is missing, DIH does
not do anything with it.

 On Tue, Feb 2, 2010 at 8:50 AM, Sascha Szott sz...@zib.de wrote:
 Hi,

 Since some of the fields used in your DIH configuration aren't mandatory
 (e.g., keywords and tags are defined as nullable in your db table schema),
 add a default value to all optional fields in your schema configuration
 (e.g., default=""). Note that Solr does not understand the db-related
 concept of null values.
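
 For example, a schema.xml declaration along those lines (field name and type are illustrative):

   <field name="keywords" type="text" indexed="true" stored="true" default=""/>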

 Solr's log output

 SolrInputDocument[{keywords=keywords(1.0)={Dolce}, name=name(1.0)={Dolce
 & Gabbana D&G Neckties designer Tie for men 543},
 productID=productID(1.0)={220213}}]

 indicates that there aren't any tags or descriptions stored for the item
 with productId 220213. Since no default value is specified, Solr raises an
 error when creating the index document.

 -Sascha

 Jean-Michel Philippon-Nadeau wrote:

 Hi,

 Thanks for the reply.

 On Tue, 2010-02-02 at 16:57 +0100, Sascha Szott wrote:

 * the output of MySQL's describe command for all tables/views referenced
 in your DIH configuration

 mysql  describe products;

 ++--+--+-+-++
 | Field          | Type             | Null | Key | Default | Extra
 |

 ++--+--+-+-++
 | productID      | int(10) unsigned | NO   | PRI | NULL    |
 auto_increment |
 | skuCode        | varchar(320)     | YES  | MUL | NULL    |
 |
 | upcCode        | varchar(320)     | YES  | MUL | NULL    |
 |
 | name           | varchar(320)     | NO   |     | NULL    |
 |
 | description    | text             | NO   |     | NULL    |
 |
 | keywords       | text             | YES  |     | NULL    |
 |
 | disqusThreadID | varchar(50)      | NO   |     | NULL    |
 |
 | tags           | text             | YES  |     | NULL    |
 |
 | createdOn      | int(10) unsigned | NO   |     | NULL    |
 |
 | lastUpdated    | int(10) unsigned | NO   |     | NULL    |
 |
 | imageURL       | varchar(320)     | YES  |     | NULL    |
 |
 | inStock        | tinyint(1)       | YES  | MUL | 1       |
 |
 | active         | tinyint(1)       | YES  |     | 1       |
 |

 ++--+--+-+-++
 13 rows in set (0.00 sec)

 mysql  describe product_soldby_vendor;
 +-+--+--+-+-+---+
 | Field           | Type             | Null | Key | Default | Extra |
 +-+--+--+-+-+---+
 | productID       | int(10) unsigned | NO   | MUL | NULL    |       |
 | productVendorID | int(10) unsigned | NO   | MUL | NULL    |       |
 | price           | double           | NO   |     | NULL    |       |
 | currency        | varchar(5)       | NO   |     | NULL    |       |
 | buyURL          | varchar(320)     | NO   |     | NULL    |       |
 +-+--+--+-+-+---+
 5 rows in set (0.00 sec)

 mysql  describe products_vendors_subcategories;

 ++--+--+-+-++
 | Field                      | Type             | Null | Key | Default |
 Extra          |

 ++--+--+-+-++
 | productVendorSubcategoryID | int(10) unsigned | NO   | PRI | NULL    |
 auto_increment |
 | productVendorCategoryID    | int(10) unsigned | NO   |     | NULL    |
 |
 | labelEnglish               | varchar(320)     | NO   |     | NULL    |
 |
 | labelFrench                | varchar(320)     | NO   |     | NULL    |
 |

 ++--+--+-+-++
 4 rows in set (0.00 sec)

 mysql  describe products_vendors_categories;

 +-+--+--+-+-++
 | Field                   | Type             | Null | Key | Default |
 Extra          |

 +-+--+--+-+-++
 | productVendorCategoryID | int(10) unsigned | NO   | PRI | NULL    |
 auto_increment |
 | labelEnglish            | varchar(320)     | NO   |     | NULL    |
 |
 | labelFrench             | varchar(320)     | NO   |     | NULL    |
 |

 +-+--+--+-+-++
 3 rows in set (0.00 sec)

 mysql  describe product_vendor_in_subcategory;
 +---+--+--+-+-+---+
 | Field             | Type             | Null | Key | Default | 

Re: DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-02-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
unfortunately, no

On Fri, Feb 5, 2010 at 2:23 PM, Jorg Heymans jorg.heym...@gmail.com wrote:
 dow, thanks for that Paul :-|

 I suppose schema validation for data-config.xml is already in Jira somewhere
 ?

 Jorg

 2010/2/5 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com

 wrong:   <datasource name="orablob" type="FieldStreamDataSource" />
 right:   <dataSource name="orablob" type="FieldStreamDataSource" />

 On Thu, Feb 4, 2010 at 9:27 PM, Jorg Heymans jorg.heym...@gmail.com
 wrote:
  Hi,
  I'm having some troubles getting this to work on a snapshot from 3rd feb
  My
  config looks as follows
      <dataSource name="ora" driver="oracle.jdbc.OracleDriver" url="" />
      <datasource name="orablob" type="FieldStreamDataSource" />
      <document name="mydoc">
          <entity dataSource="ora" name="meta" query="select id, filename,
  bytes from documents">
              <field column="ID" name="id" />
              <field column="FILENAME" name="filename" />
              <entity dataSource="orablob" processor="TikaEntityProcessor"
  url="bytes" dataField="meta.BYTES">
                <field column="text" name="mainDocument"/>
              </entity>
           </entity>
       </document>
  and i get this stacktrace
  org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
  execute query: bytes Processing Document # 1
          at
 
 org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
          at
 
 org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:253)
          at
 
 org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
          at
 
 org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
          at
 
 org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:98)
  It seems that whatever is in the url attribute it is trying to execute as
 a
  query. So i thought i put url=select bytes from documents where id =
  ${meta.ID} but then i get a classcastexception.
  Caused by: java.lang.ClassCastException:
  org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1
          at
 
 org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:98)
          at
 
 org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:233)
  Any ideas what is wrong with the config ?
  Thanks
  Jorg
  2010/1/27 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com
 
  There is no corresponding DataSurce which can be used with
  TikaEntityProcessor which reads from BLOB
  I have opened an issue.https://issues.apache.org/jira/browse/SOLR-1737
 
  On Mon, Jan 25, 2010 at 10:57 PM, Shah, Nirmal ns...@columnit.com
 wrote:
   Hi,
  
  
  
   I am fairly new to Solr and would like to use the DIH to pull rich
 text
   files (pdfs, etc) from BLOB fields in my database.
  
  
  
   There was a suggestion made to use the FieldReaderDataSource with the
   recently commited TikaEntityProcessor.  Has anyone accomplished this?
  
   This is my configuration, and the resulting error - I'm not sure if
 I'm
   using the FieldReaderDataSource correctly.  If anyone could shed light
   on whether I am going the right direction or not, it would be
   appreciated.
  
  
  
   ---Data-config.xml:
  
   <dataConfig>

      <datasource name="f1" type="FieldReaderDataSource" />

      <dataSource name="orcle" driver="oracle.jdbc.driver.OracleDriver"
    url="jdbc:oracle:thin:un/p...@host:1521:sid" />

         <document>

         <entity dataSource="orcle" name="attach" query="select id as name,
    attachment from testtable2">

            <entity dataSource="f1" processor="TikaEntityProcessor"
    dataField="attach.attachment" format="text">

               <field column="text" name="NAME" />

            </entity>

         </entity>

      </document>

   </dataConfig>
  
  
  
  
  
   -Debug error:
  
   response
  
   lst name=responseHeader
  
   int name=status0/int
  
   int name=QTime203/int
  
   /lst
  
   lst name=initArgs
  
   lst name=defaults
  
   str name=configtestdb-data-config.xml/str
  
   /lst
  
   /lst
  
   str name=commandfull-import/str
  
   str name=modedebug/str
  
   null name=documents/
  
   lst name=verbose-output
  
   lst name=entity:attach
  
   lst name=document#1
  
   str name=queryselect id as name, attachment from testtable2/str
  
   str name=time-taken0:0:0.32/str
  
   str--- row #1-/str
  
   str name=NAMEjava.math.BigDecimal:2/str
  
   str name=ATTACHMENToracle.sql.BLOB:oracle.sql.b...@1c8e807/str
  
   str-/str
  
   lst name=entity:253433571801723
  
   str name=EXCEPTION
  
   org.apache.solr.handler.dataimport.DataImportHandlerException: No
   dataSource :f1 available for entity :253433571801723 Processing
 Document
   # 1
  
                  at
  
 org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(Da

Re: DataImportHandlerException for custom DIH Transformer

2010-02-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Mon, Feb 8, 2010 at 9:13 AM, Tommy Chheng tommy.chh...@gmail.com wrote:
  I'm having trouble making a custom DIH transformer in solr 1.4.

 I compiled the General TrimTransformer into a jar. (just copy/paste sample
 code from http://wiki.apache.org/solr/DIHCustomTransformer)
 I placed the jar along with the dataimporthandler jar in solr/lib (same
 directory as the jetty jar)

Do not keep it in solr/lib; it won't work. Keep it in {solr.home}/lib.
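
For reference, a minimal sketch of such a transformer against the Solr 1.4 API, using the two-argument transformRow signature discussed below (the trimming logic is illustrative):

  package com.chheng.dih.transformers;

  import java.util.Map;

  import org.apache.solr.handler.dataimport.Context;
  import org.apache.solr.handler.dataimport.Transformer;

  public class TrimTransformer extends Transformer {
      @Override
      public Object transformRow(Map<String, Object> row, Context context) {
          // trim every String value in the row; other value types pass through untouched
          for (Map.Entry<String, Object> entry : row.entrySet()) {
              if (entry.getValue() instanceof String) {
                  entry.setValue(((String) entry.getValue()).trim());
              }
          }
          return row;
      }
  }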

 Then I added to my DIH data-config.xml file:
 transformer=DateFormatTransformer, RegexTransformer,
 com.chheng.dih.transformers.TrimTransformer

 Now I get this exception when I try running the import.
 org.apache.solr.handler.dataimport.DataImportHandlerException:
 java.lang.NoSuchMethodException:
 com.chheng.dih.transformers.TrimTransformer.transformRow(java.util.Map)
        at
 org.apache.solr.handler.dataimport.EntityProcessorWrapper.loadTransformers(EntityProcessorWrapper.java:120)

 I noticed the exception lists TrimTransformer.transformRow(java.util.Map)
 but the abstract Transformer class defines a two-parameter method:
 transformRow(Map<String, Object> row, Context context)?


 --
 Tommy Chheng
 Programmer and UC Irvine Graduate Student
 Twitter @tommychheng
 http://tommy.chheng.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: How to configure multiple data import types

2010-02-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
are you referring to nested entities?
http://wiki.apache.org/solr/DIHQuickStart#Index_data_from_multiple_tables_into_Solr
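
For reference, a rough sketch of a data-config.xml with two independent entities under one document, which is the sort of setup described below (view and field names are made up):

  <document>
    <entity name="productView" query="select id, name from product_view">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
    <entity name="orderView" query="select id, status from order_view">
      <field column="id" name="id"/>
      <field column="status" name="status"/>
    </entity>
  </document>

A single entity can also be run on its own by passing the entity request parameter, e.g. command=full-import&entity=productView.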

On Mon, Feb 8, 2010 at 5:42 PM,  stefan.ma...@bt.com wrote:
 I have got a dataimport request handler configured to index data by selecting 
 data from a DB view

 I now need to index additional data sets from other views so that I can 
 support other search queries

 I defined additional entity .. definitions within the document ..  
 section of my data-config.xml
 But I only seem to pull in data for the 1st entity ..  and not both


 Is there an xsd (or dtd) for
        data-config.xml
        schema.xml
        slrconfig.xml

 As these might help with understanding how to construct usable conf files

 Regards
 Stefan Maric
 BT Innovate  Design | Collaboration Platform - Customer Innovation Solutions




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DIH: delta-import not working

2010-02-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
try this

deltaImportQuery=select id, bytes from attachment where application =
 'MYAPP' and id = '${dataimporter.delta.id}'

Be aware that the names are case-sensitive. If the id comes back as 'ID',
this will not work.



On Tue, Feb 9, 2010 at 3:15 PM, Jorg Heymans jorg.heym...@gmail.com wrote:
 Hi,

 I am having problems getting the delta-import to work for my schema.
 Following what i have found in the list, jira and the wiki below
 configuration should just work but it doesn't.

 <dataConfig>
  <dataSource name="ora" driver="oracle.jdbc.OracleDriver"
 url="jdbc:oracle:thin:@." user="" password=""/>
  <dataSource name="orablob" type="FieldStreamDataSource" />
  <document name="mydocuments">
    <entity dataSource="ora" name="attachment" pk="id"
      query="select id, bytes from attachment where application = 'MYAPP'"
      deltaImportQuery="select id, bytes from attachment where application =
 'MYAPP' and id = '${dataimporter.attachment.id}'"
      deltaQuery="select id from attachment where application = 'MYAPP' and
 modified_on > to_date('${dataimporter.attachment.last_index_time}',
 '-mm-dd hh24:mi:ss')">
      <field column="id" name="attachmentId" />
      <entity dataSource="orablob" processor="TikaEntityProcessor"
 url="bytes" dataField="attachment.bytes">
        <field column="text" name="attachmentContents"/>
      </entity>
    </entity>
  </document>
 </dataConfig>

 The sql generated in the deltaquery is correct, the timestamp is passed
 correctly. When i execute that query manually in the DB it returns the pk of
 the rows that were added. However no documents are added to the index. What
 am i missing here ?? I'm using a build snapshot from 03/02.


 Thanks
 Jorg




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Call URL, simply parse the results using SolrJ

2010-02-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
you can also try

URL urlo = new URL(url); // ensure that the url has wt=javabin in it
NamedList<Object> namedList = (NamedList<Object>) new
JavaBinCodec().unmarshal(urlo.openConnection().getInputStream());
QueryResponse response = new QueryResponse(namedList, null);


On Mon, Feb 8, 2010 at 11:49 PM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
 Here's what I did to resolve this:

 XMLResponseParser parser = new XMLResponseParser();
 URL urlo = new URL(url);
 InputStreamReader isr = new
 InputStreamReader(urlo.openConnection().getInputStream());
 NamedList<Object> namedList = parser.processResponse(isr);
 QueryResponse response = new QueryResponse(namedList, null);

 On Mon, Feb 8, 2010 at 10:03 AM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 So here's what happens if I pass in a URL with parameters, SolrJ chokes:

 Exception in thread main java.lang.RuntimeException: Invalid base
 url for solrj.  The base URL must not contain parameters:
 http://locahost:8080/solr/main/select?q=video&qt=dismax
        at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.init(CommonsHttpSolrServer.java:205)
        at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.init(CommonsHttpSolrServer.java:180)
        at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.init(CommonsHttpSolrServer.java:152)
        at org.apache.solr.util.QueryTime.main(QueryTime.java:20)


 On Mon, Feb 8, 2010 at 9:32 AM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 Sorry for the poorly worded title... For SOLR-1761 I want to pass in a
 URL and parse the query response... However it's non-obvious to me how
 to do this using the SolrJ API, hence asking the experts here. :)






-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Solr 1.4: Full import FileNotFoundException

2010-02-12 Thread Noble Paul നോബിള്‍ नोब्ळ्
Concurrent imports are not allowed in DIH, unless you set up multiple DIH instances.

On Sat, Feb 13, 2010 at 7:05 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : I have noticed that when I run concurrent full-imports using DIH in Solr
 : 1.4, the index ends up getting corrupted. I see the following in the log

 I'm fairly confident that concurrent imports won't work -- but it
 shouldn't corrupt your index -- even if the DIH didn't actively check for
 this type of situation, the underlying Lucene LockFactory should ensure
 that one of the inports wins ... you'll need to tell us what kind of
 Filesystem you are using, and show us the relevent settings from your
 solrconfig (lock type, merge policy, indexDefaults, mainIndex, DIH,
 etc...)

 At worst you should get a lock time out exception.

 : But I looked at:
 : 
 http://old.nabble.com/dataimporthandler-and-multiple-delta-import-td19160129.html
 :
 : and was under the impression that this issue was fixed in Solr 1.4.

 ...right, attempting to run two concurrent imports with DIH should cause
 the second one to abort immediatley.




 -Hoss





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Solr 1.4: Full import FileNotFoundException

2010-02-13 Thread Noble Paul നോബിള്‍ नोब्ळ्
Can we confirm that the user does not have multiple DIH handlers configured?

Any request for an import, while an import is going on, is rejected.

On Sat, Feb 13, 2010 at 11:40 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : concurrent imports are not allowed in DIH, unless u setup multiple DIH 
 instances

 Right, but that's not the issue -- the question is wether attemping
 to do so might be causing index corruption (either because of a bug or
 because of some possibly really odd config we currently know nothing about)


 :  : I have noticed that when I run concurrent full-imports using DIH in Solr
 :  : 1.4, the index ends up getting corrupted. I see the following in the log
 : 
 :  I'm fairly confident that concurrent imports won't work -- but it
 :  shouldn't corrupt your index -- even if the DIH didn't actively check for
 :  this type of situation, the underlying Lucene LockFactory should ensure
 :  that one of the inports wins ... you'll need to tell us what kind of
 :  Filesystem you are using, and show us the relevent settings from your
 :  solrconfig (lock type, merge policy, indexDefaults, mainIndex, DIH,
 :  etc...)
 : 
 :  At worst you should get a lock time out exception.
 : 
 :  : But I looked at:
 :  : 
 http://old.nabble.com/dataimporthandler-and-multiple-delta-import-td19160129.html
 :  :
 :  : and was under the impression that this issue was fixed in Solr 1.4.
 : 
 :  ...right, attempting to run two concurrent imports with DIH should cause
 :  the second one to abort immediatley.
 : 
 : 
 : 
 : 
 :  -Hoss
 : 
 : 
 :
 :
 :
 : --
 : -
 : Noble Paul | Systems Architect| AOL | http://aol.com
 :



 -Hoss





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Preventing mass index delete via DataImportHandler full-import

2010-02-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Feb 17, 2010 at 8:03 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : I have a small worry though. When I call the full-import functions, can
 : I configure Solr (via the XML files) to make sure there are rows to
 : index before wiping everything? What worries me is if, for some unknown
 : reason, we have an empty database, then the full-import will just wipe
 : the live index and the search will be broken.

 I believe if you set clear=false when doing the full-import, DIH won't
it is clean=false

or use command=import instead of command=full-import
 delete the entire index before it starts.  it probably makes the
 full-import slower (most of the adds wind up being deletes followed by
 adds) but it should prevent you from having an empty index if something
 goes wrong with your DB.
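
For example, with a default single-core setup the clean=false and command=import suggestions above would look something like:

  http://localhost:8983/solr/dataimport?command=full-import&clean=false

or

  http://localhost:8983/solr/dataimport?command=import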

 the big catch is you now have to be responsible for managing deletes
 (using the XmlUpdateRequestHandler) yourself ... this bug looks like it's
 goal is to make this easier to deal with (but i'd not really clear to
 me what deletedPkQuery is ... it doesnt' seem to be documented.

 https://issues.apache.org/jira/browse/SOLR-1168



 -Hoss





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: @Field annotation support

2010-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
solrj jar
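
A minimal bean sketch using that annotation (class and field names are illustrative):

  import org.apache.solr.client.solrj.beans.Field;

  public class Item {
      @Field
      String id;

      @Field("name")
      String itemName;
  }

Such a bean can then be indexed with SolrServer#addBean.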

On Thu, Feb 18, 2010 at 10:52 PM, Pulkit Singhal
pulkitsing...@gmail.com wrote:
 Hello All,

 When I use Maven or Eclipse to try and compile my bean which has the
 @Field annotation as specified in http://wiki.apache.org/solr/Solrj
 page ... the compiler doesn't find any class to support the
 annotation. What jar should we use to bring in this custom Solr
 annotation?




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: replications issue

2010-02-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
What is the problem? Is the replication not happening after you do a
commit on the master?
Frequent polling is not a problem; frequent commits can slow down the system.

On Fri, Feb 19, 2010 at 2:41 PM, giskard gisk...@autistici.org wrote:
 Ciao,

 Uhm after some time a new index in data/index on the slave has been written
 with the ~size of the master index.

 the configure on both master slave is the same one on the solrReplication 
 wiki page
 enable/disable master/slave in a node

  <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">commit</str>
     <str name="confFiles">schema.xml,stopwords.txt</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://localhost:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
   </lst>
  </requestHandler>

 When the master is started, pass in -Denable.master=true and in the slave 
 pass in -Denable.slave=true. Alternately , these values can be stored in a 
 solrcore.properties file as follows

 #solrcore.properties in master
 enable.master=true
 enable.slave=false

 Il giorno 19/feb/2010, alle ore 03.43, Otis Gospodnetic ha scritto:

 giskard,

 Is this on the master or on the slave(s)?
 Maybe you can paste your replication handler config for the master and your 
 replication handler config for the slave.

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Hadoop ecosystem search :: http://search-hadoop.com/




 
 From: giskard gisk...@autistici.org
 To: solr-user@lucene.apache.org
 Sent: Thu, February 18, 2010 12:16:37 PM
 Subject: replications issue

 Hi all,

 I've setup solr replication as described in the wiki.

 when i start the replication a directory called index.$numebers is created 
 after a while
 it disappears and a new index.$othernumbers is created

 index/ remains untouched with an empty index.

 any clue?

 thank you in advance,
 Riccardo

 --
 ciao,
 giskard

 --
 ciao,
 giskard







-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: @Field annotation support

2010-02-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Feb 19, 2010 at 11:41 PM, Pulkit Singhal
pulkitsing...@gmail.com wrote:
 Ok then, is this the correct class to support the @Field annotation?
 Because I have it on the path but its not working.

yes , it is the right class. But, what is not working?
 org\apache\solr\solr-solrj\1.4.0\solr-solrj-1.4.0.jar/org\apache\solr\client\solrj\beans\Field.class

 2010/2/18 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 solrj jar

 On Thu, Feb 18, 2010 at 10:52 PM, Pulkit Singhal
 pulkitsing...@gmail.com wrote:
 Hello All,

 When I use Maven or Eclipse to try and compile my bean which has the
 @Field annotation as specified in http://wiki.apache.org/solr/Solrj
 page ... the compiler doesn't find any class to support the
 annotation. What jar should we use to bring in this custom Solr
 annotation?




 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Using XSLT with DIH for a URLDataSource

2010-02-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
The xslt file looks fine. Is the location of the file correct?

On Mon, Feb 22, 2010 at 2:57 PM, Roland Villemoes r...@alpha-solutions.dk 
wrote:

 Hi

 (thanks a lot)

 Yes, The full stacktrace is this:

 22-02-2010 08:37:00 org.apache.solr.handler.dataimport.DataImporter 
 doFullImport
 SEVERE: Full Import failed
 org.apache.solr.handler.dataimport.DataImportHandlerException: Error 
 initializing XSL  Processing Document # 1
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:103)
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.init(XPathEntityProcessor.java:76)
        at 
 org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:71)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:319)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
        at 
 org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
        at 
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
        at 
 org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:203)
        at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
        at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
        at java.lang.Thread.run(Thread.java:619)
 Caused by: javax.xml.transform.TransformerConfigurationException: Could not 
 compile stylesheet
        at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:825)
        at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:614)
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:98)
        ... 24 more
 22-02-2010 08:37:00 org.apache.solr.update.DirectUpdateHandler2 rollback


 My import feed (for testing is this):
 ?xml version='1.0' encoding='utf-8'?
 products
 product id='738' rank='10'
 brand id='48'![CDATA[World's Best]]/brandname![CDATA[Kontakt 
 Cream-Special 4 x 10]]/name
 categories primarycategory='17'
    category id='7'
        name![CDATA[Jeans  Bukser]]/name
        category id='17'
            name![CDATA[Jeans]]/name
        /category
    /category
    category id='8'
        name![CDATA[Nyheder]]/name
    /category
 /categories
 description![CDATA[4 pakker med 10 stk. glatte kondomer, med reservoir og 
 creme.]]/descriptionprice currency='SEK'310.70/pricesalesprice 
 currency='SEK'233.03/salespricecolor id='227'![CDATA[4 x 10 
 kondomer]]/colorsize 
 id='6'![CDATA[Large]]/sizeproductUrl![CDATA[http://www.website.se/butik/visvare.asp?id=738]]/productUrlimageUrl![CDATA[http://www.website.se/varebilleder/738_intro.jpg]]/imageUrllastmodified11-11-2008
  15:10:31/lastmodified/product
 product id='320' rank='10'
  categories primarycategory='17'
    category id='7'
      name![CDATA[Jeans  Bukser]]/name
      category id='17'
        name![CDATA[Jeans]]/name
      /category
    /category
    category id='8'
      name![CDATA[Nyheder]]/name
    /category
  /categories
  brand id='1'![CDATA[JBS]]/brandname![CDATA[JBS 
 trusser]]/namecategory 
 id='39'![CDATA[Trusser]]/categorydescription![CDATA[Gråmeleret JBS 
 trusser model Classic med gylp.]]/descriptionprice 
 currency='SEK'154.96/pricesalesprice 
 currency='SEK'154.96/salespricecolor 
 id='28'![CDATA[Gråmeleret]]/colorsize 
 

Re: error while using the DIH handler

2010-02-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
can you paste the DIH part in your solrconfig.xml ?
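
For reference, the DIH registration in solrconfig.xml usually looks something like this (the config file name must match the file in the conf directory):

  <requestHandler name="/dataimport"
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>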

On Tue, Feb 23, 2010 at 7:01 PM, Na_D nabam...@zaloni.com wrote:

 yes i did check the location of the data-config.xml

 its in the folder example-DIH/solr/db/conf
 --
 View this message in context: 
 http://old.nabble.com/error-while-using-the-DIH-handler-tp27702772p2770.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Using XSLT with DIH for a URLDataSource

2010-02-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
You are right, the StreamSource class is not throwing the proper exception.

Do we really have to handle this?

On Thu, Feb 25, 2010 at 9:06 AM, Lance Norskog goks...@gmail.com wrote:
 [Taken off the list]

 The problem is that the XSLT code swallows the real exception, and
 does not return it as the deeper exception.  To show the right
 error, the code would open a file name or an URL directly. The problem
 is, the code has to throw an exception on a file or an URL and try the
 other, then decide what to do.

       try {
          URL u = new URL(xslt);
          iStream = u.openStream();
        } catch (MalformedURLException e) {
          iStream = new FileInputStream(new File(xslt));
        }
        TransformerFactory transFact = TransformerFactory.newInstance();
        xslTransformer = transFact.newTransformer(new StreamSource(iStream));


 On Mon, Feb 22, 2010 at 6:24 AM, Roland Villemoes r...@alpha-solutions.dk 
 wrote:
 You're right!

 I was as simple (stupid!) as that,

 Thanks a lot (for your time .. very appreciated)

 Roland

 -----Original Message-----
 From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble
 Paul നോബിള്‍ नोब्ळ्
 Sent: 22 February 2010 14:01
 To: solr-user@lucene.apache.org
 Subject: Re: Using XSLT with DIH for a URLDataSource

 The xslt file looks fine . is the location of the file correct ?

 On Mon, Feb 22, 2010 at 2:57 PM, Roland Villemoes r...@alpha-solutions.dk 
 wrote:

 Hi

 (thanks a lot)

 Yes, The full stacktrace is this:

 22-02-2010 08:37:00 org.apache.solr.handler.dataimport.DataImporter 
 doFullImport
 SEVERE: Full Import failed
 org.apache.solr.handler.dataimport.DataImportHandlerException: Error 
 initializing XSL  Processing Document # 1
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:103)
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.init(XPathEntityProcessor.java:76)
        at 
 org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:71)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:319)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
        at 
 org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
        at 
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
        at 
 org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:203)
        at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
        at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
        at java.lang.Thread.run(Thread.java:619)
 Caused by: javax.xml.transform.TransformerConfigurationException: Could not 
 compile stylesheet
        at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:825)
        at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:614)
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:98)
        ... 24 more
 22-02-2010 08:37:00 org.apache.solr.update.DirectUpdateHandler2 rollback


 My import feed (for testing is this):
 ?xml version='1.0' encoding='utf-8'?
 products
 product id='738' rank='10'
 brand id='48'![CDATA[World's Best]]/brandname![CDATA[Kontakt 
 Cream-Special 4 x 10]]/name
 categories primarycategory='17'
    category id='7'
        name![CDATA[Jeans  Bukser]]/name
        category id='17'

Re: Using XSLT with DIH for a URLDataSource

2010-02-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
This is the only place where this should be a problem; 'xsl' is not a
very commonly used attribute.

On Fri, Feb 26, 2010 at 10:46 AM, Lance Norskog goks...@gmail.com wrote:
 There could be a common 'open an url' utility method. This would help
 make the DIH components consistent.

 2010/2/24 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 you are right. The StreamSource class is not throwing the proper exception

 Do we really have to handle this.?

 On Thu, Feb 25, 2010 at 9:06 AM, Lance Norskog goks...@gmail.com wrote:
 [Taken off the list]

 The problem is that the XSLT code swallows the real exception, and
 does not return it as the deeper exception.  To show the right
 error, the code would open a file name or an URL directly. The problem
 is, the code has to throw an exception on a file or an URL and try the
 other, then decide what to do.

       try {
          URL u = new URL(xslt);
          iStream = u.openStream();
        } catch (MalformedURLException e) {
          iStream = new FileInputStream(new File(xslt));
        }
        TransformerFactory transFact = TransformerFactory.newInstance();
        xslTransformer = transFact.newTransformer(new StreamSource(iStream));


 On Mon, Feb 22, 2010 at 6:24 AM, Roland Villemoes r...@alpha-solutions.dk 
 wrote:
 You're right!

 I was as simple (stupid!) as that,

 Thanks a lot (for your time .. very appreciated)

 Roland

 -----Original Message-----
 From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble
 Paul നോബിള്‍ नोब्ळ्
 Sent: 22 February 2010 14:01
 To: solr-user@lucene.apache.org
 Subject: Re: Using XSLT with DIH for a URLDataSource

 The xslt file looks fine . is the location of the file correct ?

 On Mon, Feb 22, 2010 at 2:57 PM, Roland Villemoes 
 r...@alpha-solutions.dk wrote:

 Hi

 (thanks a lot)

 Yes, The full stacktrace is this:

 22-02-2010 08:37:00 org.apache.solr.handler.dataimport.DataImporter 
 doFullImport
 SEVERE: Full Import failed
 org.apache.solr.handler.dataimport.DataImportHandlerException: Error 
 initializing XSL  Processing Document # 1
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:103)
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.init(XPathEntityProcessor.java:76)
        at 
 org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:71)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:319)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
        at 
 org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
        at 
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
        at 
 org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:203)
        at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
        at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
        at java.lang.Thread.run(Thread.java:619)
 Caused by: javax.xml.transform.TransformerConfigurationException: Could 
 not compile stylesheet
        at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:825)
        at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:614)
        at 
 org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:98)
        ... 24 more
 22-02-2010 08:37:00

Re: If you could have one feature in Solr...

2010-02-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Feb 24, 2010 at 7:18 PM, Patrick Sauts patrick.via...@gmail.com wrote:
 Synchronisation between the slaves to switch the new index at the same time
 after replication.

I shall open an issue for this, and let us figure out how best it should be done.
https://issues.apache.org/jira/browse/SOLR-1800


Re: replication issue

2010-03-01 Thread Noble Paul നോബിള്‍ नोब्ळ्
The data/index.20100226063400 dir is a temporary dir and is created in
the same dir where the index dir is located.

I'm wondering if the symlink is causing the problem. Why don't you set
the data dir as /raid/data instead of /solr/data?

On Sat, Feb 27, 2010 at 12:13 AM, Matthieu Labour
matthieu_lab...@yahoo.com wrote:
 Hi

 I am still having issues with the replication and wonder if things are 
 working properly

 So I have 1 master and 1 slave

 On the slave, I deleted the data/index directory and 
 data/replication.properties file and restarted solr.

 When slave is pulling data from master, I can see that the size of data 
 directory is growing

 r...@slr8:/raid/data# du -sh
 3.7M    .
 r...@slr8:/raid/data# du -sh
 4.7M    .

 and I can see that data/replication.properties  file got created and also a 
 directory data/index.20100226063400

 soon after index.20100226063400 disapears and the size of data/index is back 
 to 12K

 r...@slr8:/raid/data/index# du -sh
 12K    .

 And when I look for the number of documents via the admin interface, I still 
 see 0 documents so I feel something is wrong

 One more thing, I have a symlink for /solr/data --- /raid/data

 Thank you for your help !

 matt










-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: If you could have one feature in Solr...

2010-03-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Mar 5, 2010 at 4:34 AM, Mark Miller markrmil...@gmail.com wrote:
 On 03/04/2010 05:56 PM, Chris Hostetter wrote:

 : The ability to read solr configuration files from the classpath instead
 of
 : solr.solr.home directory.

 Solr has always supported this.

 When SolrResourceLoader.openResourceLoader is asked to open a resource it
 first checks if it's an absolute path -- if it's not then it checks
 relative the conf dir (under whatever the instanceDir is, ie: Solr Home
 in a single core setup), then it checks relative the current working dir
 and if it still can't find it it checks via the current ClassLoader.

 that said: it's not something that a lot of people have ever taken
 advantage of, so it wouldn't suprise me if some features in Solr are
 buggy because they try to open files directly w/o utilizing
 openResourceLoader -- in particular a quick test of the trunk example
 using...
 java -Djetty.class.path=./solr/conf -Dsolr.solr.home=/tmp/new-solr-home
 -jar start.jar

 ...seems to suggest that QueryElevationComponent isn't using openResource
 to look for elevate.xml  (i set solr.solr.home in that line so solr would
 *NOT* attempt to look at ./solr ... it does need some sort of Solr Home,
 but in this case it was a completley empty directory)


 -Hoss



 I've been trying to think of ways to tackle this. I hate getConfigDir - it
 lets anyone just get around the ResourceLoader basically.

 It would be awesome to get rid of it somehow - it would make
 ZooKeeperSolrResourceLoader so much easier to get working correctly across
 the board.
Why not just get rid of it? Components depending on filesystems is a
big headache.

 The main thing I'm hung up on is how to update a file - some code I've seen
 uses getConfigDir to update files eg you get the content of solrconfig, then
 you want to update it and reload the core. Most other things, I think are
 doable without getConfigDir.

 QueryElevationComponent is actually sort of simple to get around - we just
 need to add an exists method that return true/false if the resource exists.
 QEC just uses getConfigDir to a do an exists on the elevate.xml - if its not
 there, it looks in the data dir.

 --
 - Mark

 http://www.lucidimagination.com







-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Is it possible to use ODBC with DIH?

2010-03-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you have a JDBC-ODBC bridge driver, it should be fine.
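
For example, a dataSource sketch assuming the Sun JDBC-ODBC bridge that ships with the JDK and a DSN named MyDSN (both are placeholders to adapt):

  <dataSource driver="sun.jdbc.odbc.JdbcOdbcDriver"
              url="jdbc:odbc:MyDSN"
              user="user" password="pass"/>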

On Sun, Mar 7, 2010 at 4:52 AM, JavaGuy84 bbar...@gmail.com wrote:

 Hi,

 I have a ODBC driver with me for MetaMatrix DB(Redhat). I am trying to
 figure out a way to use DIH using the DSN which has been created in my
 machine with that ODBC driver?

 Is it possible to spcify a DSN in DIH and index the DB? if its possible, can
 you please let me know the ODBC URL that I need to enter for Datasource in
 DIH data-config.xml?

 Thanks,
 Barani
 --
 View this message in context: 
 http://old.nabble.com/Is-it-possible-to-use-ODBC-with-DIH--tp27808016p27808016.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: XPath Processing Applied to Clob

2010-03-17 Thread Noble Paul നോബിള്‍ नोब्ळ्
Keep in mind that the xpath is case-sensitive. Paste a sample xml.

What is dataField="d.text"? It does not seem to refer to anything;
where is the enclosing entity?
Did you mean dataField="doc.text"?

xpath="//BODY" is a supported syntax as long as you are using Solr 1.4 or higher
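
Putting those points together, a rough sketch of the nesting being described (column and field names follow the original post; the exact case of the dataField key may need checking, since names are case-sensitive as noted above):

  <dataSource name="f" type="FieldReaderDataSource"/>
  <document>
    <entity name="doc" transformer="ClobTransformer"
            query="SELECT d.EFFECTIVE_DT, d.ARCHIVE_ID, d.XML.getClobVal() AS TEXT FROM DOC d">
      <field column="TEXT" name="text" clob="true"/>
      <entity dataSource="f" processor="XPathEntityProcessor"
              dataField="doc.text" forEach="/MESSAGE">
        <field column="body" xpath="//BODY"/>
      </entity>
    </entity>
  </document>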




On Thu, Mar 18, 2010 at 3:15 AM, Neil Chaudhuri
nchaudh...@potomacfusion.com wrote:
 Incidentally, I tried adding this:

 <datasource name="f" type="FieldReaderDataSource" />
 <document>
        <entity dataSource="f" processor="XPathEntityProcessor"
 dataField="d.text" forEach="/MESSAGE">
                  <field column="body" xpath="//BODY"/>
        </entity>
 </document>

 But this didn't seem to change anything.

 Any insight is appreciated.

 Thanks.



 From: Neil Chaudhuri
 Sent: Wednesday, March 17, 2010 3:24 PM
 To: solr-user@lucene.apache.org
 Subject: XPath Processing Applied to Clob

 I am using the DataImportHandler to index 3 fields in a table: an id, a date, 
 and the text of a document. This is an Oracle database, and the document is 
 an XML document stored as Oracle's xmltype data type. Since this is nothing 
 more than a fancy CLOB, I am using the ClobTransformer to extract the actual 
 XML. However, I don't want to index/store all the XML but instead just the 
 XML within a set of tags. The XPath itself is trivial, but it seems like the 
 XPathEntityProcessor only works for XML file content rather than the output 
 of a Transformer.

 Here is what I currently have that fails:


 <document>

        <entity name="doc" query="SELECT d.EFFECTIVE_DT, d.ARCHIVE_ID,
 d.XML.getClobVal() AS TEXT FROM DOC d" transformer="ClobTransformer">

            <field column="EFFECTIVE_DT" name="effectiveDate" />

            <field column="ARCHIVE_ID" name="id" />

            <field column="TEXT" name="text" clob="true"/>
            <entity name="text" processor="XPathEntityProcessor"
 forEach="/MESSAGE" url="${doc.text}">
                <field column="body" xpath="//BODY"/>

            </entity>

        </entity>

 </document>


 Is there an easy way to do this without writing my own custom transformer?

 Thanks.




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: DIH best pratices question

2010-03-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Sat, Mar 27, 2010 at 3:25 AM, Blargy zman...@hotmail.com wrote:

 I have a items table on db1 and and item_descriptions table on db2.

 The items table is very small in the sense that it has small columns while
 the item_descriptions table has a very large text field column. Both tables
 are around 7 million rows

 What is the best way to import these into one document?

 <document>
  <entity name="item" ... >
     <entity name="item_descriptions" ... ></entity>
   </entity>
 </document>

this is the right way
 Or

 <document>
     <entity name="item_descriptions" rootEntity="false">
     <entity name="item" ... ></entity>
   </entity>
 </document>

 Or is there an alternative way? Maybe using the second way with a
 CachedSqlEntityProcessor for the item entity?
I don't think CachedSqlEntityProcessor helps here.

 Any thoughts are greatly appreciated. Thanks!
 --
 View this message in context: 
 http://n3.nabble.com/DIH-best-pratices-question-tp677568p677568.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: expungeDeletes on commit in Dataimport

2010-03-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Mar 25, 2010 at 10:14 PM, Ruben Chadien
ruben.chad...@aspiro.com wrote:
 Hi

 I know this has been discussed before, but is there any way do 
 expungeDeletes=true when the DataImportHandler does the commit.
 expungeDeletes=true not being used does not mean that the doc does not
get deleted. deleteDocByQuery does not do a commit; if you wish to
commit you should do it explicitly.
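
For example, an explicit commit with expungeDeletes can be posted to the update handler (URL assumes a default setup):

  curl http://localhost:8983/solr/update -H "Content-Type: text/xml" --data-binary '<commit expungeDeletes="true"/>'
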
 I am using the deleteDocByQuery in a Transformer when doing a delta-import 
 and as discussed before the documents are not deleted until restart.

 Also, how do i know in a Transformer if its running a Delta or Full Import , 
 i tries looking at Context. currentProcess() but that gives me FULL_DUMP 
 when doing a delta import...?
the variable ${dataimporter.request.command} tells you which command
is being run

 Thanks!
 Ruben Chadien



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: ReplicationHandler reports incorrect replication failures

2010-03-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
please create a bug

On Fri, Mar 26, 2010 at 7:29 PM, Shawn Smith ssmit...@gmail.com wrote:
 We're using Solr 1.4 Java replication, which seems to be working
 nicely.  While writing production monitors to check that replication
 is healthy, I think we've run into a bug in the status reporting of
 the ../solr/replication?command=details command.  (I know it's
 experimental...)

 Our monitor parses the replication?command=details XML and checks that
 replication lag is reasonable by diffing the indexVersion of the
 master and slave indices to make sure it's within a reasonable time
 range.

 Our monitor also compares the first elements of
 indexReplicatedAtList and replicationFailedAtList lists to see if
 the last replication attempt failed.  This is where we're having a
 problem with the monitor throwing false errors.  It looks like there's
 a bug that causes successful replications to be considered failures.
 The bug is triggered immediately after a slave restarts when the slave
 is already in sync with the master.  Each no-op replication attempt
 after restart is considered a failure until something on the master
 changes and replication has to actually do work.

 From the code, it looks like SnapPuller.successfulInstall starts out
 false on restart.  If the slave starts out in sync with the master,
 then each no-op replication poll leaves successfulInstall set to
 false which makes SnapPuller.logReplicationTimeAndConfFiles log the
 poll as a failure.  SnapPuller.successfulInstall stays false until the
 first time replication actually has to do something, at which point it
 gets set to true, and then everything is OK.

 Thanks,
 Shawn




-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Solr indexing not taking all values from DB.

2008-10-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
The DIH status says 10 rows which means only 10 rows got fetched for
that query. Do you have any custom transformers which eats up rows?

Try the debug page of DIH and see what is happening to the rest of the rows.



On Fri, Oct 10, 2008 at 5:32 PM, con [EMAIL PROTECTED] wrote:

 A simple question:
 I performed the following steps to index data from a oracle db to solr index
 and then search:
 a) I have the configurations for indexing data from a oracle db
 b) started the server.
 c) Done a full-import:
 http://localhost:8983/solr/dataimport?command=full-import

 But when I do a search using http://localhost:8983/solr/select/?q=
 Not all the result sets that matches the search string are displayed.

 1) Is the above steps enough for getting db values to solr index?
 My configurations (data-config.xml and schema.xml )are quite correct because
 I am getting SOME of the result sets as search result(not all).
 2) Is there some value in sorconfig.xml, or some other files that limits the
 number of items being indexed? [For the time being I have only a few
 hundreds of records in my db. ]
 The query that I am specifying in data-config yields around 25 results if i
 execute it in a oracle client, where as the status of full-import is
 something like:
 <str name="status">idle</str>
 <str name="importResponse">Configuration Re-loaded sucessfully</str>
 <lst name="statusMessages">
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">10</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2008-10-10 17:29:03</str>
<str name="Time taken ">0:0:0.513</str>
 </lst>



 --
 View this message in context: 
 http://www.nabble.com/Solr-indexing-not-taking-all-values-from-DB.-tp19916938p19916938.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
--Noble Paul


Re: Solr indexing not taking all values from DB.

2008-10-12 Thread Noble Paul നോബിള്‍ नोब्ळ्
TemplateTransformer does not eat up rows.

I am almost sure that the query returns only 10 rows in that case.
Could you write a quick JDBC program and verify that (not the Oracle
client)?

Everything else looks fine.

On Sat, Oct 11, 2008 at 4:52 PM, con [EMAIL PROTECTED] wrote:

 Hi Noble
 Thanks for your reply

 In my data-config.xml I have;

entity name=employees transformer=TemplateTransformer  
 query=Select
 EMP_ID , EMP_NAME , NVL (COMMENT ,'-Nil-') as COMMENT  from EMPLOYEES
field column=rowtype template=employees /
field column=EMP_ID name=EMP_ID /
field column=EMP_NAME name=EMP_NAME /
field column=COMMENT name=COMMENT /
/entity

entity name=customers transformer=TemplateTransformer  
 query=Select
 CUST_ID , CUST_NAME , NVL (COMMENT ,'-Nil-') as COMMENT  from CUSTOMERS
field column=rowtype template=customers /
field column=CUST_ID name=CUST_ID /
field column=CUST_NAME name=CUST_NAME /
field column=COMMENT name=COMMENT /
/entity

 Whether this, TemplateTransformer, is the one that is restricting the
 resultset count to 10?
 Where can I find it out?
 I need this TemplateTransformer because I want to query the responses of
 either one of these at a time using the URL like,

 http://localhost:8983/solr/select/?q=(Bob%20AND%20rowtype:customers)version=2.2start=0rows=10indent=onwt=json

 I tried in the debug mode:
 (http://localhost:8983/solr/dataimport?command=full-importdebug=onverbose=on)
 , But it is not all mentioning anything after the 10th document.


 Thanks and regards
 con








-- 
--Noble Paul


Re: Solr indexing not taking all values from DB.

2008-10-13 Thread Noble Paul നോബിള്‍ नोब्ळ्
In debug mode it writes only 10 documents because there is a rows parameter
which defaults to 10.
Make it 100 or so and you should see all the docs. In non-debug
mode there is no such limit.
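
For example, the debug import can be invoked with an explicit rows parameter
(host, port and core path are whatever your setup uses; rows is documented on
the DataImportHandler wiki):

http://localhost:8983/solr/dataimport?command=full-import&debug=on&verbose=on&rows=100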

On Sun, Oct 12, 2008 at 11:00 PM, con [EMAIL PROTECTED] wrote:

 I wrote a JDBC program that runs the same query, and it returns all
 25 rows.
 But Solr is still indexing only 10 rows.
 Is there any optimization setting enabled by default in solrconfig.xml that
 restricts the responses to 10?
 thanks
 con.










-- 
--Noble Paul


Re: Solr indexing not taking all values from DB.

2008-10-13 Thread Noble Paul നോബിള്‍ नोब्ळ्
Now just do a normal full-import (do not enable debug). I guess it
should be just fine.

On Mon, Oct 13, 2008 at 1:20 PM, con [EMAIL PROTECTED] wrote:

 Thanks Noble
 I tried the debug mode with rows=100 and it accepts all the result
 sets,
 so I suppose there is nothing wrong with the query.
 But I am not able to update the index, since this works only in
 debug mode.

 Can you please give some suggestions based on this.

 thanks
 con





Re: Need Help, Can I query the index from command line

2008-10-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
see an example here
http://wiki.apache.org/solr/DataImportHandler#head-e68aa93c9ca7b8d261cede2bf1d6110ab1725476
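
As for the query-from-a-script part of the question, any HTTP client will do,
since Solr answers plain GET requests. A minimal sketch (host, field name and
keyword are placeholders for whatever your schema uses):

curl 'http://localhost:8983/solr/select?q=text:keyword&rows=10&wt=json'

The numFound value in the response tells you whether the feed matched the keyword.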

On Tue, Oct 14, 2008 at 9:17 PM, Erik Hatcher
[EMAIL PROTECTED] wrote:
 Solr's new DataImportHandler can index RSS (and Atom should be fine too)
 feeds.

Erik

 On Oct 14, 2008, at 11:37 AM, msizec wrote:


 Thank you for your help.

 I've just realized that Solr cannot index pages from the web on its own.

 I wonder if any of you would know another open source search tool
 that could do this job: indexing pages (RSS, Atom feeds) from a list of
 URLs
 and let me query it from the command line, so that I could know, in a
 script,
 which feed contains a keyword.


 Don't know if you will understand what I mean, but I hope so!

 Thanks !





-- 
--Noble Paul


Re: error with delta import

2008-10-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
The query makes my head spin.
Joining in SQL does not enable you to populate multivalued fields.
Otherwise, it is all fine.

The pk attribute is missing in the entity.
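
For a genuinely multivalued field, the usual DIH approach is a child entity
rather than a SQL join; a rough sketch with invented table and column names:

<entity name="article" query="select id, title from articles">
    <field column="id" name="id" />
    <field column="title" name="title" />
    <entity name="tag" query="select tag from article_tags where article_id = '${article.id}'">
        <field column="tag" name="tags" />
    </entity>
</entity>

Each row returned by the inner entity then becomes one value of the
multivalued tags field.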

On Tue, Oct 14, 2008 at 6:16 PM, Florian Aumeier
[EMAIL PROTECTED] wrote:
 Noble Paul നോബിള്‍ नोब्ळ् schrieb:

 apparently you have not specified the deltaQuery attribute in the entity.
  Check the delta-import section in the wiki
 http://wiki.apache.org/solr/DataImportHandler
 or you can share your data-config file and we can take a quick look



 Here is my data-config. I configured both the deltaQuery and the query in one
 entity in the same data-config. Is this the correct usage?
 Also, I found it easier to join the document at the database level instead
 of leaving it to Solr.

 <dataConfig>
   <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
       url="jdbc:postgresql://bm02:5432/bm" user="user" />

   <document name="articles">
     <entity name="articles"
         deltaQuery="SELECT a.id AS article_id, a.stub AS article_stub, a.ref AS article_ref,
             a.id_blogs, a.title AS article_title, a.normalized_text, au.url AS article_url,
             bu.url AS blog_url, b.title AS blog_title, b.subtitle AS blog_subtitle, r.rank,
             coalesce(a.updated,a.published,a.added) as ts
             FROM articles a
             join blogs b on a.id_blogs = b.id
             join urls au on a.id_urls = au.id
             join urls bu on b.id_urls = bu.id
             LEFT OUTER JOIN ranks r on a.id = r.id_articles
             WHERE b.id_urls is not null AND a.hidden is false AND b.hidden is false
             AND a.ref is not null AND b.ref is not null
             AND (rankid in (SELECT rankid FROM ranks order by rankid desc limit 1) OR rankid is null)
             AND coalesce(a.updated,a.published,a.added) &gt; '${dataimporter.last_index_time}'"
         query="SELECT a.id AS article_id, a.stub AS article_stub, a.ref AS article_ref,
             a.id_blogs, a.title AS article_title, a.normalized_text, au.url AS article_url,
             bu.url AS blog_url, b.title AS blog_title, b.subtitle AS blog_subtitle, r.rank,
             coalesce(a.updated,a.published,a.added) as ts
             FROM articles a
             join blogs b on a.id_blogs = b.id
             join urls au on a.id_urls = au.id
             join urls bu on b.id_urls = bu.id
             LEFT OUTER JOIN ranks r on a.id = r.id_articles
             WHERE b.id_urls is not null AND a.hidden is false AND b.hidden is false
             AND a.ref is not null AND b.ref is not null
             AND (rankid in (SELECT rankid FROM ranks order by rankid desc limit 1) OR rankid is null)
             AND coalesce(a.updated,a.published,a.added)">
       <field column="article_id" name="a_id" />
       <field column="normalized_text" name="norm_text" />
       <field column="article_ref" name="id" />
       <field column="article_stub" name="stub" />
       <field column="id_blogs" name="blog_id" />
       <field column="article_title" name="a_title" />
       <field column="article_url" name="article_url" />
       <field column="ts" name="ts" />
       <field column="rank" name="rank" />
       <field column="blog_ref" name="blog_ref" />
       <field column="blog_title" name="b_title" />
       <field column="blog_subtitle" name="subtitle" />
       <field column="blog_url" name="blog_url" />
     </entity>
   </document>
 </dataConfig>

 Florian





-- 
--Noble Paul


Re: error with delta import

2008-10-15 Thread Noble Paul നോബിള്‍ नोब्ळ्
The delta implementation in DIH is a bit fragile for complex queries.

I recommend you do the delta-import using a full-import.

It can be done as follows:
define a different entity.

<dataConfig>
  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
      url="jdbc:postgresql://bm02:5432/bm" user="user" />

  <document name="articles">
    <entity name="articles-full" ...>
    </entity>

    <entity name="articles-delta" rootEntity="false"
        query="your-delta-query-here">
      <!-- the following entity can be a copy of the articles-full entity
           without any delta query. Because rootEntity=false is set on
           articles-delta, the following entity is the one used for creating
           documents; all other rules are the same. -->
      <entity name="anyname" ...>
      </entity>
    </entity>
  </document>
</dataConfig>

When you wish to do a full-import, pass the request parameter
entity=articles-full

For a delta-import, use the request parameters
entity=articles-delta&clean=false (the command has to be full-import in both cases).
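
Spelled out as full request URLs (host, port and core path are placeholders),
that would be something like:

http://localhost:8983/solr/dataimport?command=full-import&entity=articles-full
http://localhost:8983/solr/dataimport?command=full-import&entity=articles-delta&clean=false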



On Wed, Oct 15, 2008 at 1:42 PM, Florian Aumeier
[EMAIL PROTECTED] wrote:
 Shalin Shekhar Mangar schrieb:

 You are missing the pk field (primary key). This is used for delta
 imports.


 I added the pk field and rebuilt the index yesterday. However, when I run
 the delta-import, I still have this error message in the log:

 INFO: Starting delta collection.
 Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Running ModifiedRowKey() for Entity: articles
 Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.JdbcDataSource$1
 call
 INFO: Creating a connection for entity articles with URL:
 jdbc:postgresql://bm02:5432/bm
 Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.JdbcDataSource$1
 call
 INFO: Time taken for getConnection(): 43
 Oct 15, 2008 9:37:36 AM org.apache.solr.core.SolrCore execute
 INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
 Oct 15, 2008 9:44:51 AM org.apache.solr.core.SolrCore execute
 INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
 Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed ModifiedRowKey for Entity: articles rows obtained : 4584
 Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Running DeletedRowKey() for Entity: articles
 Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed DeletedRowKey for Entity: articles rows obtained : 0
 Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed parentDeltaQuery for Entity: articles
 Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DataImporter
 doDeltaImport
 SEVERE: Delta Import Failed
 java.lang.NullPointerException
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
   at
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
   at
 org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
   at
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
   at
 org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
   at
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
   at
 org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
 Oct 15, 2008 9:50:58 AM org.apache.solr.core.SolrCore execute
 INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0

 Regards
 Florian




-- 
--Noble Paul


Re: error with delta import

2008-10-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
The last_index_time is only available from the second import onwards; that is,
it expects a full-import to have been done first.
It knows that by the presence of dataimport.properties in the config
directory. Did you check if it is present?
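
For reference, dataimport.properties is a plain Java properties file that DIH
writes after a successful import; its content is roughly a single entry like
the following (the timestamp is only an example, and java.util.Properties
escapes the colons when it writes the file):

last_index_time=2008-10-16 13\:16\:11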


On Thu, Oct 16, 2008 at 5:33 PM, Florian Aumeier
[EMAIL PROTECTED] wrote:
 Noble Paul നോബിള്‍ नोब्ळ् schrieb:

 Well, when doing the way you described below (full-import with the delta
 query), the '${dataimporter.last_index_time}' timestamp is empty:


 I guess this was fixed post-1.3. You can probably take the
 dataimporthandler jar from a nightly build (you may also need to add
 slf4j.jar).


 I replaced
 dist/apache-solr-dataimporthandler-1.3.0.jar
 dist/solrj-lib/slf4j-api-1.5.3.jar
 dist/solrj-lib/slf4j-jdk14-1.5.3.jar

 with their counterparts from the nightly build, but it did not help. Then I
 tried to enter the date kind of hard coded (now() - '12 hours'::interval).
 Everything looks fine, but there are no new documents in the index.

 here is the log:

 INFO: Starting Full Import
 Oct 16, 2008 1:07:08 PM org.apache.solr.core.SolrCore execute
 INFO: [test] webapp=/solr path=/dataimport params={command=full-import&clean=false&entity=articles-delta} status=0 QTime=0
 Oct 16, 2008 1:07:08 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
 INFO: Creating a connection for entity articles-delta with URL: jdbc:postgresql://bm02:5432/bm
 Oct 16, 2008 1:07:08 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
 INFO: Time taken for getConnection(): 45
 Oct 16, 2008 1:14:53 PM org.apache.solr.core.SolrCore execute
 INFO: [test] webapp=/solr path=/dataimport params={} status=0 QTime=1
 Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
 INFO: Read dataimport.properties
 Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.SolrWriter persistStartTime
 INFO: Wrote last indexed time to dataimport.properties
 Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.DocBuilder commit
 INFO: Full Import completed successfully
 Oct 16, 2008 1:16:11 PM org.apache.solr.update.DirectUpdateHandler2 commit
 INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
 Oct 16, 2008 1:16:11 PM org.apache.solr.search.SolrIndexSearcher init
 INFO: Opening [EMAIL PROTECTED] main
 Oct 16, 2008 1:16:11 PM org.apache.solr.update.DirectUpdateHandler2 commit
 INFO: end_commit_flush
 ... (autowarming)
 Oct 16, 2008 1:16:12 PM org.apache.solr.handler.dataimport.DocBuilder execute
 INFO: Time taken = 0:9:3.231





-- 
--Noble Paul


Re: dataimport, both splitBy and dateTimeFormat

2008-10-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
Thanks David,
I have updated the wiki documentation:
http://wiki.apache.org/solr/DataImportHandler#transformer

The default transformers do not have any special privilege; each one is like
any normal user-provided transformer. We just identified some commonly
found use cases and added transformers for them.

Applying a transformer is not very 'cheap': it has to do extra checks
to know whether to apply itself or not.
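
For the case discussed in this thread, the corrected entity would look roughly
like this (the table name is only illustrative); RegexTransformer handles
splitBy and DateFormatTransformer handles dateTimeFormat, applied in the order
they are listed:

<entity name="events" transformer="RegexTransformer,DateFormatTransformer"
        query="select r_event_date from events">
    <field column="r_event_date" splitBy=" " dateTimeFormat="yyyy-MM-dd" />
</entity>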

On Fri, Oct 17, 2008 at 12:26 AM, David Smiley @MITRE.org
[EMAIL PROTECTED] wrote:

 The wiki didn't mention I can specify multiple transformers.  BTW, it's
 transformer (singular), not transformers.  I did mean both NFT and DFT
 because I was speaking of the general case, not just mine in particular.  I
 thought that the built-in transformers were always in-effect and so I
 expected NFT,DFT to occur last.  Sorry if I wasn't clear.

 Thanks for your help; it worked.

 ~ David


 Shalin Shekhar Mangar wrote:

 Hi David,

 I think you meant RegexTransformer instead of NumberFormatTransformer.
 Anyhow, the order in which the transformers are applied is the same as the
 order in which you specify them.

 So make sure your entity has
 transformers=RegexTransformer,DateFormatTransformer.

 On Thu, Oct 16, 2008 at 6:14 PM, David Smiley @MITRE.org
 [EMAIL PROTECTED]wrote:


 I'm trying out the dataimport capability.  I have a column that is a
 series
 of dates separated by spaces like so:
 1996-00-00 1996-04-00
 And I'm trying to import it like so:
 <field column="r_event_date" splitBy=" " dateTimeFormat="yyyy-MM-dd" />

 However this fails and the stack trace suggests it is first trying to
 apply
 the dateTimeFormat before splitBy.  I think this is a bug... dataimport
 should apply DateFormatTransformer and NumberFormatTransformer last.

 ~ David Smiley




 --
 Regards,
 Shalin Shekhar Mangar.







-- 
--Noble Paul


Re: RegexTransformer debugging (DIH)

2008-10-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
If it is a normal exception, it is logged with the number of the document
where it failed, and you can put it under the debugger with start=x-1&rows=1.

We do not catch Throwable or Error, so this one slips through.

If you are adventurous enough, wrap the RegexTransformer with your own class,
apply that instead (say transformer=my.RegexWrapper), catch
Throwable and print out the row.
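
A rough sketch of such a wrapper, assuming the Solr 1.3 DIH API (the class and
package names below are only illustrative, and the sketch is untested):

    package my;

    import java.util.Map;

    import org.apache.solr.handler.dataimport.Context;
    import org.apache.solr.handler.dataimport.RegexTransformer;

    public class RegexWrapper extends RegexTransformer {
        @Override
        public Map<String, Object> transformRow(Map<String, Object> row, Context context) {
            try {
                // delegate to the real RegexTransformer
                return super.transformRow(row, context);
            } catch (Throwable t) {
                // print the offending row so the bad document can be identified
                System.err.println("RegexTransformer failed on row: " + row);
                throw new RuntimeException(t);
            }
        }
    }

Register it on the entity with transformer="my.RegexWrapper" in place of
RegexTransformer.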




On Thu, Oct 16, 2008 at 9:49 PM, Jon Baer [EMAIL PROTECTED] wrote:
 Is there a way to prevent this from occurring (or a way to nail down the doc
 which is causing it?):

 INFO: [news] webapp=/solr path=/admin/dataimport params={command=status}
 status=0 QTime=0
 Exception in thread Thread-14 java.lang.StackOverflowError
at java.util.regex.Pattern$Single.match(Pattern.java:3313)
at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4763)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
at java.util.regex.Pattern$All.match(Pattern.java:4079)
at java.util.regex.Pattern$Branch.match(Pattern.java:4538)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4578)
at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4767)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
at java.util.regex.Pattern$All.match(Pattern.java:4079)
at java.util.regex.Pattern$Branch.match(Pattern.java:4538)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4578)
at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4767)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
at java.util.regex.Pattern$All.match(Pattern.java:4079)

 Thanks.

 - Jon





-- 
--Noble Paul


Re: Different XML format for multi-valued fields?

2008-10-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
The component that writes out the values does not know whether the field is
multivalued or not. So if it finds only a single value it writes it
out as such.


On Thu, Oct 16, 2008 at 10:52 PM, oleg_gnatovskiy
[EMAIL PROTECTED] wrote:

 Hello. I have an index built in Solr with several multi-value fields. When
 the multi-value field has only one value for a document, the XML returned
 looks like this:
 <arr name="someIds">
   <long name="someIds">5693</long>
 </arr>
 However, when there are multiple values for the field, the XML looks like
 this:
 <arr name="someIds">
   <long>11199</long>
   <long>1722</long>
 </arr>
 Is there a reason for this difference? Also, how does faceting work with
 multi-valued fields? It seems that I sometimes get facet results from
 multi-valued fields, and sometimes I don't.

 Thanks.





-- 
--Noble Paul


Re: Solr search not displaying all the indexed values.

2008-10-17 Thread Noble Paul നോബിള്‍ नोब्ळ्
How do you know that the indexing is fine? Does a query of *:* give all
the results you wanted?
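
(One thing worth ruling out first: the select handler returns only 10 rows by
default, so check the numFound value in the response, or raise rows explicitly,
e.g. http://localhost:8983/solr/select?q=*:*&rows=100 with your own host and
core path.)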

On Thu, Oct 16, 2008 at 3:58 PM, con [EMAIL PROTECTED] wrote:


 Yes. something similar to :

    <entity name="sample1" transformer="TemplateTransformer" pk="userID"
            query="select * from USER, CUSTOMER where USER.userID = CUSTOMER.userID">
        <field column="rowtype" template="sample1" />
        <field column="userID" name="userID" />
    </entity>

    <entity name="sample2" transformer="TemplateTransformer" pk="userID"
            query="select * from USER, MANAGER where USER.desig = MANAGER.desig">
        <field column="rowtype" template="sample2" />
        <field column="userID" name="userID" />
    </entity>

 But the search will not give all the results, even when there is only one
 result, whereas the indexing is fine.
 Thanks
 con


 Noble Paul നോബിള്‍ नोब्ळ् wrote:

 do you have 2 queries in 2 different entities?


 On Thu, Oct 16, 2008 at 3:17 PM, con [EMAIL PROTECTED] wrote:

 I have two queries in my data-config.xml which takes values from multiple
 tables, like:
 select * from EMPLOYEE, CUSTOMER where EMPLOYEE.prod_id=
 CUSTOMER.prod_id.

 When i do a full-import it is indexing all the rows as expected.

 But when i search it with a *:* , it is not displaying all the values.
 Do I need any extra configurations?

 Thanks
 con





 --
 --Noble Paul








-- 
--Noble Paul


Re: Solr search not displaying all the indexed values.

2008-10-17 Thread Noble Paul നോബിള്‍ नोब्ळ्
This is to debug your problem:
remove the uniqueKey and run the import,
then see if all the docs are shown.
If yes, then you have duplicate ids, which cause some documents to be removed.
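
That is, temporarily comment out the declaration in schema.xml that looks like
the following (the field name is whatever your schema actually uses):

<!-- <uniqueKey>userID</uniqueKey> -->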





On Fri, Oct 17, 2008 at 2:01 PM, con [EMAIL PROTECTED] wrote:

 The response that I get while executing
 http://localhost:8983/solr/core0/dataimport?command=full-import
 shows that all the rows expected as the output of that query are
 getting indexed.
 The count,
 <str name="">
    Indexing completed. Added/Updated: 19 documents. Deleted 0 documents.
 </str>
 is as expected.

 But when I invoke a *:* it displays only 9 records.

 Similarly, for another entity that indexes around 500 records, a *:* gives
 only 4 responses.

 Why this inconsistency? How can I fix it before deploying it in actual
 production?
 Thanks
 con







-- 
--Noble Paul


Re: MySql - Solr 1.3 - Full import, how to make request pack smaller?

2008-10-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
Do you have nested entities? Then there is a chance of firing too many
requests to MySQL.

If you have nested entities, try using CachedSqlEntityProcessor for the
inner ones (only for the inner ones). I am assuming you have enough
RAM to support this.
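
A rough sketch of what that looks like (table and column names are invented;
the where attribute syntax is the one documented for CachedSqlEntityProcessor
on the DataImportHandler wiki):

<entity name="video" query="select id, title from videos">
    <field column="id" name="id" />
    <field column="title" name="title" />
    <entity name="tag" processor="CachedSqlEntityProcessor"
            query="select video_id, tag from video_tags"
            where="video_id=video.id">
        <field column="tag" name="tags" />
    </entity>
</entity>

The inner query is run once, cached in memory, and looked up per parent row
instead of being re-issued against MySQL for every video.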
--Noble

On Mon, Oct 20, 2008 at 3:13 PM, sunnyfr [EMAIL PROTECTED] wrote:

 Hi,
 I don't have a memory problem, but it's a production database and I am
 blocking other services on it because I am sending too many requests to it.
 How can I make the import take longer but use fewer MySQL resources?
 Thanks a lot,







-- 
--Noble Paul


Re: error with delta import

2008-10-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar
[EMAIL PROTECTED] wrote:
 Your data-config looks fine except for one thing -- you do not need to
 escape '' character in an XML attribute. It maybe throwing off the parsing
 code in DataImportHandler.
not really '' is fine in attribute


 Another question, does the full-import work fine?

 On Mon, Oct 20, 2008 at 7:31 PM, Florian Aumeier
 [EMAIL PROTECTED]wrote:

 sorry to bother you again, but the delta import still does not work for me
 :-(

 We tried:
 * delta-import by full-import
   entity name=articles-delta rootEntity=false
 query=your-delta-query-here with entity=articles-deltaclean=false

 * delta-import by full-import with simplified query

 * delta-import with simplified query
   entity name=articles-delta pk=article_ref deltaQuery=SELECT *
 FROM full_text_view WHERE article_id lt; 300

 * replaced files below with files from nightly-build 15.10.08 and rerun the
 delta and full imports as described above
 dist/apache-solr-dataimporthandler-1.3.0.jar
 dist/solrj-lib/slf4j-api-1.5.3.jar
 dist/solrj-lib/slf4j-jdk14-1.5.3.jar


 No matter what we do, we always end up in a situation, when the dataimport
 status looks fine:

 <lst name="statusMessages">
   <str name="Time Elapsed">0:0:8.442</str>
   <str name="Total Requests made to DataSource">1</str>
   <str name="Total Rows Fetched">218</str>
   <str name="Total Documents Skipped">0</str>
   <str name="Delta Dump started">2008-10-20 15:31:54</str>
   <str name="Identifying Delta">2008-10-20 15:31:54</str>
   <str name="Deltas Obtained">2008-10-20 15:31:57</str>
   <str name="Building documents">2008-10-20 15:31:57</str>
   <str name="Total Changed Documents">218</str>

 but the log reads:
 Oct 20, 2008 3:56:44 PM org.apache.solr.core.SolrCore execute
 INFO: [test] webapp=/solr path=/dataimport params={command=delta-import}
 status=0 QTime=0
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DataImporter
 doDeltaImport
 INFO: Starting Delta Import
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.SolrWriter
 readIndexerProperties
 INFO: Read dataimport.properties
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
 doDelta
 INFO: Starting delta collection.
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Running ModifiedRowKey() for Entity: articles-full
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
 call
 INFO: Creating a connection for entity articles-full with URL:
 jdbc:postgresql://blogmonitor02:5432/blogmonitor
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
 call
 INFO: Time taken for getConnection(): 5
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed ModifiedRowKey for Entity: articles-full rows obtained :
 218
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Running DeletedRowKey() for Entity: articles-full
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed DeletedRowKey for Entity: articles-full rows obtained : 0
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed parentDeltaQuery for Entity: articles-full
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DataImporter
 doDeltaImport
 SEVERE: Delta Import Failed
 java.lang.NullPointerException
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
   at
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
   at
 org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
   at
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
   at
 org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
   at
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
   at
 org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)

 here is the full data-config:

 dataConfig
  dataSource type=JdbcDataSource driver=org.postgresql.Driver
   url=jdbc:postgresql://bm02:5432/bm user=bm /

  document name=articles
   entity name=articles-full pk=id query=SELECT * FROM full_text_view
 where article_id lt; 200 deltaQuery=SELECT * FROM full_text_view WHERE
 article_id lt; 300
 field column=article_id name=a_id /
 field column=normalized_text name=norm_text /
 field column=article_ref name=id /
 field column=article_stub name=stub /
 field column=id_blogs name=blog_id /
 field column=article_title name=a_title /
 field column=article_url name=article_url /
 field column=ts name=ts /
 field column=rank name=rank /
   field 

Re: error with delta import

2008-10-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
You are still doing a delta-import. With the modified data-config you
must use command=full-import.


On Mon, Oct 20, 2008 at 7:31 PM, Florian Aumeier
[EMAIL PROTECTED] wrote:
 sorry to bother you again, but the delta import still does not work for me
 :-(

 We tried:
 * delta-import by full-import
   entity name=articles-delta rootEntity=false
 query=your-delta-query-here with entity=articles-deltaclean=false

 * delta-import by full-import with simplified query

 * delta-import with simplified query
   entity name=articles-delta pk=article_ref deltaQuery=SELECT *
 FROM full_text_view WHERE article_id lt; 300

 * replaced files below with files from nightly-build 15.10.08 and rerun the
 delta and full imports as described above
 dist/apache-solr-dataimporthandler-1.3.0.jar
 dist/solrj-lib/slf4j-api-1.5.3.jar
 dist/solrj-lib/slf4j-jdk14-1.5.3.jar


 No matter what we do, we always end up in a situation, when the dataimport
 status looks fine:

 <lst name="statusMessages">
   <str name="Time Elapsed">0:0:8.442</str>
   <str name="Total Requests made to DataSource">1</str>
   <str name="Total Rows Fetched">218</str>
   <str name="Total Documents Skipped">0</str>
   <str name="Delta Dump started">2008-10-20 15:31:54</str>
   <str name="Identifying Delta">2008-10-20 15:31:54</str>
   <str name="Deltas Obtained">2008-10-20 15:31:57</str>
   <str name="Building documents">2008-10-20 15:31:57</str>
   <str name="Total Changed Documents">218</str>

 but the log reads:
 Oct 20, 2008 3:56:44 PM org.apache.solr.core.SolrCore execute
 INFO: [test] webapp=/solr path=/dataimport params={command=delta-import}
 status=0 QTime=0
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DataImporter
 doDeltaImport
 INFO: Starting Delta Import
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.SolrWriter
 readIndexerProperties
 INFO: Read dataimport.properties
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
 doDelta
 INFO: Starting delta collection.
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Running ModifiedRowKey() for Entity: articles-full
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
 call
 INFO: Creating a connection for entity articles-full with URL:
 jdbc:postgresql://blogmonitor02:5432/blogmonitor
 Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
 call
 INFO: Time taken for getConnection(): 5
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed ModifiedRowKey for Entity: articles-full rows obtained : 218
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Running DeletedRowKey() for Entity: articles-full
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed DeletedRowKey for Entity: articles-full rows obtained : 0
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
 collectDelta
 INFO: Completed parentDeltaQuery for Entity: articles-full
 Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DataImporter
 doDeltaImport
 SEVERE: Delta Import Failed
 java.lang.NullPointerException
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
   at
 org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
   at
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
   at
 org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
   at
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
   at
 org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
   at
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
   at
 org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)

 here is the full data-config:

 <dataConfig>
   <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
       url="jdbc:postgresql://bm02:5432/bm" user="bm" />

   <document name="articles">
     <entity name="articles-full" pk="id"
         query="SELECT * FROM full_text_view where article_id &lt; 200"
         deltaQuery="SELECT * FROM full_text_view WHERE article_id &lt; 300">
       <field column="article_id" name="a_id" />
       <field column="normalized_text" name="norm_text" />
       <field column="article_ref" name="id" />
       <field column="article_stub" name="stub" />
       <field column="id_blogs" name="blog_id" />
       <field column="article_title" name="a_title" />
       <field column="article_url" name="article_url" />
       <field column="ts" name="ts" />
       <field column="rank" name="rank" />
       <field column="blog_ref" name="blog_ref" />
       <field column="blog_title" name="b_title" />
       <field column="blog_subtitle" name="subtitle" />
       <field column="blog_url" name="blog_url" />
     </entity>
   </document>
 </dataConfig>

 what are we doing wrong?
 Florian



Re: MySql - Solr 1.3 - Full import, how to make request pack smaller?

2008-10-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
Nested entities means one entity inside another, as follows:

<entity name="e1" ...>
    ...
    <entity name="e2" ...>
        ...
    </entity>
</entity>

In this case, for each row returned by e1, one query is executed for e2.



On Mon, Oct 20, 2008 at 6:01 PM, sunnyfr [EMAIL PROTECTED] wrote:

 Sorry, but 'nested entities' is not an expression that I know (I'm French).
 What does it mean? Is it when one request has several tables with
 inner joins between them?







-- 
--Noble Paul


Re: function to clear up string to utf8 before indexing, where should I put it?

2008-10-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
You can try writing a custom Transformer to do that conversion.
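
A minimal sketch of such a transformer, assuming the Solr 1.3 DIH Transformer
API (the class, package and column names below are invented, and the actual
clean-up logic should be whatever your existing function does):

    package my;

    import java.nio.charset.Charset;
    import java.util.Map;

    import org.apache.solr.handler.dataimport.Context;
    import org.apache.solr.handler.dataimport.Transformer;

    public class FixEncodingTransformer extends Transformer {
        public Object transformRow(Map<String, Object> row, Context context) {
            Object value = row.get("description");   // example column name
            if (value instanceof String) {
                // re-interpret text that was decoded with the wrong charset
                byte[] bytes = ((String) value).getBytes(Charset.forName("ISO-8859-1"));
                row.put("description", new String(bytes, Charset.forName("UTF-8")));
            }
            return row;
        }
    }

Then reference it on the entity with transformer="my.FixEncodingTransformer".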

On Wed, Oct 22, 2008 at 2:00 PM, sunnyfr [EMAIL PROTECTED] wrote:

 I have a function that cleans up strings from latin1 to UTF-8. I would like
 to know exactly where I should plug it into the Java code so that strings are
 cleaned up before indexing.

 Thanks a lot for this information,
 Sunny

 I'm using solr1.3, mysql, tomcat55





-- 
--Noble Paul


Re: function to clear up string to utf8 before indexing, where should I put it?

2008-10-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
http://wiki.apache.org/solr/DataImportHandler#head-eb523b0943596587f05532f3ebc506ea6d9a606b

On Wed, Oct 22, 2008 at 4:41 PM, sunnyfr [EMAIL PROTECTED] wrote:

 Can you tell me more about it ?







-- 
--Noble Paul


Re: error with delta import

2008-10-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
The case in point is DIH. DIH uses the standard DOM parser that comes
with the JDK. If it reads the XML properly, do we need to complain? I guess
that data-config.xml may not be used for any other purpose.


On Wed, Oct 22, 2008 at 10:10 PM, Walter Underwood
[EMAIL PROTECTED] wrote:
 On 10/22/08 8:57 AM, Steven A Rowe [EMAIL PROTECTED] wrote:

 Telling people that it's not a problem (or required!) to write 
 non-well-formed
 XML, because a particular XML parser can't accept well-formed XML is kind of
 insidious.

 I'm with you all the way on this.

 A parser which accepts non-well-formed XML is not an XML parser, since the
 XML spec requires reporting a fatal error.

 It is really easy to test these things. Modern browsers have good XML
 parsers, so put your test case in a test.xml file and open it in a
 browser. If it isn't well-formed, you'll get an error.

 Here is my test XML:

 root attribute=/

 Here is what Firefox 3.0.3 says about that:

 XML Parsing Error: not well-formed
 Location: file:///Users/wunderwood/Desktop/test.xml
 Line Number 1, Column 18:

 root attribute=/
 -^

 wunder





-- 
--Noble Paul

