Re: More debugging DIH - URLDataSource (solved)

2012-08-28 Thread Carrie Coy
Thank you for these suggestions.   The real problem was incorrect syntax 
for the primary key column in data-config.xml.   Once I corrected that, 
the data loaded fine.


Wrong:




Right:
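The actual snippets did not survive the list archive. Purely as a hypothetical illustration (entity and field names invented, not from the original message), the kind of primary-key mismatch described might look like:

```xml
<!-- Hypothetical; the real snippets were lost in archiving. -->

<!-- Wrong: pk refers to a column the entity never produces -->
<entity name="pageviews" pk="ID">
  <field column="id" xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"/>
</entity>

<!-- Right: pk matches the declared field column exactly -->
<entity name="pageviews" pk="id">
  <field column="id" xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"/>
</entity>
```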





On 08/25/2012 08:52 PM, Lance Norskog wrote:

About XPaths: the XPath engine supports only a limited subset of XPath. The
docs say that your paths fall within that subset.

About logs: you only have the RegexTransformer listed. You need to add
LogTransformer to the transformer list:
http://wiki.apache.org/solr/DataImportHandler#LogTransformer

Having XML entity codes in the URL string seems right. Can you verify
the URL that goes to the remote site? Can you read the logs at the
remote site? Can you run this code through a proxy and watch the data?

On Fri, Aug 24, 2012 at 1:34 PM, Carrie Coy  wrote:

I'm trying to write a DIH to incorporate page view metrics from an XML feed
into our index.   The DIH makes a single request, and updates 0 documents.
I set log level to "finest" for the entire dataimport section, but I still
can't tell what's wrong.  I suspect the XPath.
http://localhost:8080/solr/core1/admin/dataimport.jsp?handler=/dataimport
returns 404.  Any suggestions on how I can debug this?

solr-spec version: 4.0.0.2012.08.06.22.50.47

The XML data:






<ReportDataResponse>
  <Data>
    <Rows>
      <Row>
        <Value columnId="PAGE_NAME">PRODUCT: BURLAP POTATO SACKS  (PACK OF 12) (W4537)</Value>
        <Value columnId="PAGE_VIEWS">2388</Value>
      </Row>
      <Row>
        <Value columnId="PAGE_NAME">PRODUCT: OPAQUE PONY BEADS 6X9MM  (BAG OF 850) (BE9000)</Value>
        <Value columnId="PAGE_VIEWS">1313</Value>
      </Row>
    </Rows>
  </Data>
</ReportDataResponse>





My DIH:

<dataConfig>
  <dataSource name="coremetrics"
              type="URLDataSource"
              encoding="UTF-8"
              connectionTimeout="5000"
              readTimeout="1"/>
  <document>
    <!-- entity and field names below were lost in the archive and are reconstructed -->
    <entity name="pageviews"
            dataSource="coremetrics"
            pk="id"
            url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=**&amp;username=&amp;format=XML&amp;userAuthKey=&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
            processor="XPathEntityProcessor"
            stream="true"
            forEach="/ReportDataResponse/Data/Rows/Row"
            logLevel="fine"
            transformer="RegexTransformer">
      <field column="id"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
             regex="/^PRODUCT:.*\((.*?)\)$/"
             replaceWith="$1"/>
      <field column="page_views"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
    </entity>
  </document>
</dataConfig>

This little test Perl script correctly extracts the data:

use XML::XPath;
use XML::XPath::XMLParser;

my $xp = XML::XPath->new(filename => 'cm.xml');
my $nodeset = $xp->find('/ReportDataResponse/Data/Rows/Row');
foreach my $node ($nodeset->get_nodelist) {
    my $page_name  = $node->findvalue('Value[@columnId="PAGE_NAME"]');
    my $page_views = $node->findvalue('Value[@columnId="PAGE_VIEWS"]');
    $page_name =~ s/^PRODUCT:.*\((.*?)\)$/$1/;
    print "$page_name: $page_views\n";    # show each extracted pair
}

 From logs:

INFO: Loading DIH Configuration: data-config.xml
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter
loadDataConfig
INFO: Data Configuration loaded successfully
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import}
status=0 QTime=2
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter
doFullImport
INFO: Starting Full Import
Aug 24, 2012 3:53:10 PM
org.apache.solr.handler.dataimport.SimplePropertiesWriter
readIndexerProperties
INFO: Read dataimport.properties
Aug 24, 2012 3:53:10 PM org.apache.solr.update.DirectUpdateHandler2
deleteAll
INFO: [ssww] REMOVING ALL DOCUMENTS FROM INDEX
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.URLDataSource
getData
FINE: Accessing URL:
https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=*&username=***&format=XML&userAuthKey=**&language=en_US&viewID=9475540&period_a=M20110930
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=0
Aug 24, 2012 3:53:12 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=1
Aug 24, 2012 3:53:14 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=1
Aug 24, 2012 3:53:16 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=0
Aug 24, 2012 3:53:18 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=0
Aug 24, 2012 3:53:20 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=0
Aug 24, 2012 3:53:22 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=0
Aug 24, 2012 3:53:24 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=0
Aug 24, 2012 3:53:27 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
QTime=0
Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Aug 24, 2012 3:53:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start
commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false

Re: Debugging DIH

2012-08-26 Thread Walter Underwood
ISO 8601.

The entire standard is rather complex, so most software uses a subset of it. A 
useful subset is described here: http://www.w3.org/TR/NOTE-datetime

ISO 8601 does not allow "Z001" for milliseconds. The "Z" is for UTC (Zulu in 
military time) and follows the time portion. Milliseconds (or any other 
subdivision) are represented after a decimal point on the seconds value, 
"2012-01-01T01:01:01.001Z".
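A small sketch of producing that form from a Unix timestamp (the function name is ours, not from the thread):

```python
from datetime import datetime, timezone

def to_solr_date(epoch_seconds):
    """Format a Unix timestamp in the ISO 8601 subset Solr expects,
    e.g. 2012-01-01T01:01:01.001Z (milliseconds after the decimal, Z for UTC)."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    # strftime has no millisecond directive, so append it from the microseconds
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"

print(to_solr_date(1325379661))  # -> 2012-01-01T01:01:01.000Z
```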

wunder

On Aug 26, 2012, at 7:34 PM, Lance Norskog wrote:

> The timestamp format is 2012-01-01T01:01:01, with an optional Z001 for
> milliseconds. The timezone is UTC. This is a standard format but I do
> not remember the name of the standard.
> 
> On Sun, Aug 26, 2012 at 2:43 PM, Hasan Diwan  wrote:
>> Mr Norskog, et al,
>> 
>> On 26 August 2012 14:37, Lance Norskog  wrote:
>> 
>>> Also, there is a logging feature to print intermediate values.
>>> 
>> 
>> I see the data as it should be. It's just not recorded into SOLR. One
>> possible concern is that I have timestamp in epoch seconds, which I'd like
>> to store as a date on the SOLR side; I know I can apply a transformer to do
>> this, but what's the format for it? Many thanks! -- H
>> --
>> Sent from my mobile device
>> Envoyait de mon portable







Re: Debugging DIH

2012-08-26 Thread Lance Norskog
The timestamp format is 2012-01-01T01:01:01, with an optional Z001 for
milliseconds. The timezone is UTC. This is a standard format but I do
not remember the name of the standard.

On Sun, Aug 26, 2012 at 2:43 PM, Hasan Diwan  wrote:
> Mr Norskog, et al,
>
> On 26 August 2012 14:37, Lance Norskog  wrote:
>
>> Also, there is a logging feature to print intermediate values.
>>
>
> I see the data as it should be. It's just not recorded into SOLR. One
> possible concern is that I have timestamp in epoch seconds, which I'd like
> to store as a date on the SOLR side; I know I can apply a transformer to do
> this, but what's the format for it? Many thanks! -- H
> --
> Sent from my mobile device
> Envoyait de mon portable



-- 
Lance Norskog
goks...@gmail.com


Re: Debugging DIH

2012-08-26 Thread Hasan Diwan
Mr Norskog, et al,

On 26 August 2012 14:37, Lance Norskog  wrote:

> Also, there is a logging feature to print intermediate values.
>

I see the data as it should be. It's just not recorded into SOLR. One
possible concern is that I have timestamp in epoch seconds, which I'd like
to store as a date on the SOLR side; I know I can apply a transformer to do
this, but what's the format for it? Many thanks! -- H
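One DIH-side sketch of that conversion (not from this thread; the entity, column, and function names are invented) uses a ScriptTransformer to turn epoch seconds into a java.util.Date, which maps onto a Solr date field:

```xml
<dataConfig>
  <script><![CDATA[
    /* 'ts' is a hypothetical epoch-seconds column */
    function epochToDate(row) {
      var secs = parseInt(row.get('ts'), 10);
      row.put('ts', new java.util.Date(secs * 1000));
      return row;
    }
  ]]></script>
  <document>
    <entity name="txn" transformer="script:epochToDate"
            query="SELECT ts, amount FROM ledger">
      <field column="ts" name="timestamp"/>
    </entity>
  </document>
</dataConfig>
```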
-- 
Sent from my mobile device
Envoyait de mon portable


Re: Debugging DIH

2012-08-26 Thread Lance Norskog
Also, there is a logging feature to print intermediate values.

Another point is the complexity of your query. It can be easier to
test if you define the query as a database view instead of embedding it
in the DIH config.

On Fri, Aug 24, 2012 at 9:00 AM, Ahmet Arslan  wrote:
>
>> That is not completely true. If the columns have the same
>> names as the fields, the mapping is redundant. Nevertheless,
>> it might be the problem. What I've experienced with Oracle,
>> at least, is that the columns would be returned in uppercase
>> even if my alias would be in lowercase. You might force it
>> by adding quotes, though. Or try adding
>>
>> 
>> 
>> 
>>
>> You might check in your preferred SQL client how the column
>> names are returned. It might be an indicator. (At least, in
>> my case they would be uppercase in SQL Developer.)
>
> There is a jsp page for debugging DIH
>
> http://localhost:8080/solr/admin/dataimport.jsp?handler=/dataimport



-- 
Lance Norskog
goks...@gmail.com


Re: More debugging DIH - URLDataSource

2012-08-25 Thread Lance Norskog
About XPaths: the XPath engine supports only a limited subset of XPath. The
docs say that your paths fall within that subset.

About logs: you only have the RegexTransformer listed. You need to add
LogTransformer to the transformer list:
http://wiki.apache.org/solr/DataImportHandler#LogTransformer

Having XML entity codes in the URL string seems right. Can you verify
the URL that goes to the remote site? Can you read the logs at the
remote site? Can you run this code through a proxy and watch the data?
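The LogTransformer addition mentioned above might look like this (the entity name and logTemplate variable are illustrative, not from the original config):

```xml
<entity name="pageviews"
        transformer="LogTransformer,RegexTransformer"
        logTemplate="row: ${pageviews.id}"
        logLevel="info">
  <!-- fields as before; LogTransformer runs first and logs each row -->
</entity>
```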

On Fri, Aug 24, 2012 at 1:34 PM, Carrie Coy  wrote:
> I'm trying to write a DIH to incorporate page view metrics from an XML feed
> into our index.   The DIH makes a single request, and updates 0 documents.
> I set log level to "finest" for the entire dataimport section, but I still
> can't tell what's wrong.  I suspect the XPath.
> http://localhost:8080/solr/core1/admin/dataimport.jsp?handler=/dataimport
> returns 404.  Any suggestions on how I can debug this?
>
> solr-spec version: 4.0.0.2012.08.06.22.50.47
>
> The XML data:
>
> 
> 
> 
> 
> 
> <ReportDataResponse>
>   <Data>
>     <Rows>
>       <Row>
>         <Value columnId="PAGE_NAME">PRODUCT: BURLAP POTATO SACKS  (PACK OF 12) (W4537)</Value>
>         <Value columnId="PAGE_VIEWS">2388</Value>
>       </Row>
>       <Row>
>         <Value columnId="PAGE_NAME">PRODUCT: OPAQUE PONY BEADS 6X9MM  (BAG OF 850) (BE9000)</Value>
>         <Value columnId="PAGE_VIEWS">1313</Value>
>       </Row>
>     </Rows>
>   </Data>
> </ReportDataResponse>
> 
> 
> 
> 
>
> My DIH:
>
> <dataConfig>
>   <dataSource name="coremetrics"
>               type="URLDataSource"
>               encoding="UTF-8"
>               connectionTimeout="5000"
>               readTimeout="1"/>
>   <document>
>     <!-- entity and field names below were lost in the archive and are reconstructed -->
>     <entity name="pageviews"
>             dataSource="coremetrics"
>             pk="id"
>             url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=**&amp;username=&amp;format=XML&amp;userAuthKey=&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
>             processor="XPathEntityProcessor"
>             stream="true"
>             forEach="/ReportDataResponse/Data/Rows/Row"
>             logLevel="fine"
>             transformer="RegexTransformer">
>       <field column="id"
>              xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
>              regex="/^PRODUCT:.*\((.*?)\)$/"
>              replaceWith="$1"/>
>       <field column="page_views"
>              xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
>     </entity>
>   </document>
> </dataConfig>
>
> This little test Perl script correctly extracts the data:
>
> use XML::XPath;
> use XML::XPath::XMLParser;
>
> my $xp = XML::XPath->new(filename => 'cm.xml');
> my $nodeset = $xp->find('/ReportDataResponse/Data/Rows/Row');
> foreach my $node ($nodeset->get_nodelist) {
>     my $page_name  = $node->findvalue('Value[@columnId="PAGE_NAME"]');
>     my $page_views = $node->findvalue('Value[@columnId="PAGE_VIEWS"]');
>     $page_name =~ s/^PRODUCT:.*\((.*?)\)$/$1/;
>     print "$page_name: $page_views\n";    # show each extracted pair
> }
>
> From logs:
>
> INFO: Loading DIH Configuration: data-config.xml
> Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter
> loadDataConfig
> INFO: Data Configuration loaded successfully
> Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import}
> status=0 QTime=2
> Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter
> doFullImport
> INFO: Starting Full Import
> Aug 24, 2012 3:53:10 PM
> org.apache.solr.handler.dataimport.SimplePropertiesWriter
> readIndexerProperties
> INFO: Read dataimport.properties
> Aug 24, 2012 3:53:10 PM org.apache.solr.update.DirectUpdateHandler2
> deleteAll
> INFO: [ssww] REMOVING ALL DOCUMENTS FROM INDEX
> Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.URLDataSource
> getData
> FINE: Accessing URL:
> https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=*&username=***&format=XML&userAuthKey=**&language=en_US&viewID=9475540&period_a=M20110930
> Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=0
> Aug 24, 2012 3:53:12 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=1
> Aug 24, 2012 3:53:14 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=1
> Aug 24, 2012 3:53:16 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=0
> Aug 24, 2012 3:53:18 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=0
> Aug 24, 2012 3:53:20 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=0
> Aug 24, 2012 3:53:22 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=0
> Aug 24, 2012 3:53:24 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0
> QTime=0
> Aug 24, 2012 3:53:27 PM org.apache.solr.core.SolrCore execute
> INFO: [ssww] webapp=/sol

More debugging DIH - URLDataSource

2012-08-24 Thread Carrie Coy
I'm trying to write a DIH to incorporate page view metrics from an XML 
feed into our index.   The DIH makes a single request, and updates 0 
documents.  I set log level to "finest" for the entire dataimport 
section, but I still can't tell what's wrong.  I suspect the XPath.   
http://localhost:8080/solr/core1/admin/dataimport.jsp?handler=/dataimport returns 
404.  Any suggestions on how I can debug this?


solr-spec version: 4.0.0.2012.08.06.22.50.47

The XML data:






<ReportDataResponse>
  <Data>
    <Rows>
      <Row>
        <Value columnId="PAGE_NAME">PRODUCT: BURLAP POTATO SACKS  (PACK OF 12) (W4537)</Value>
        <Value columnId="PAGE_VIEWS">2388</Value>
      </Row>
      <Row>
        <Value columnId="PAGE_NAME">PRODUCT: OPAQUE PONY BEADS 6X9MM  (BAG OF 850) (BE9000)</Value>
        <Value columnId="PAGE_VIEWS">1313</Value>
      </Row>
    </Rows>
  </Data>
</ReportDataResponse>





My DIH:

<dataConfig>
  <dataSource name="coremetrics"
              type="URLDataSource"
              encoding="UTF-8"
              connectionTimeout="5000"
              readTimeout="1"/>
  <document>
    <!-- entity and field names below were lost in the archive and are reconstructed -->
    <entity name="pageviews"
            dataSource="coremetrics"
            pk="id"
            url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=**&amp;username=&amp;format=XML&amp;userAuthKey=&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
            processor="XPathEntityProcessor"
            stream="true"
            forEach="/ReportDataResponse/Data/Rows/Row"
            logLevel="fine"
            transformer="RegexTransformer">
      <field column="id"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
             regex="/^PRODUCT:.*\((.*?)\)$/"
             replaceWith="$1"/>
      <field column="page_views"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
    </entity>
  </document>
</dataConfig>

This little test Perl script correctly extracts the data:

use XML::XPath;
use XML::XPath::XMLParser;

my $xp = XML::XPath->new(filename => 'cm.xml');
my $nodeset = $xp->find('/ReportDataResponse/Data/Rows/Row');
foreach my $node ($nodeset->get_nodelist) {
    my $page_name  = $node->findvalue('Value[@columnId="PAGE_NAME"]');
    my $page_views = $node->findvalue('Value[@columnId="PAGE_VIEWS"]');
    $page_name =~ s/^PRODUCT:.*\((.*?)\)$/$1/;
    print "$page_name: $page_views\n";    # show each extracted pair
}
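A stdlib-Python analogue of the same check (the sample XML is reconstructed from the XPaths above; no network access needed):

```python
import re
import xml.etree.ElementTree as ET

# Minimal sample of the Coremetrics response, rebuilt from the XPaths in the thread.
SAMPLE = """<ReportDataResponse><Data><Rows>
  <Row>
    <Value columnId="PAGE_NAME">PRODUCT: BURLAP POTATO SACKS  (PACK OF 12) (W4537)</Value>
    <Value columnId="PAGE_VIEWS">2388</Value>
  </Row>
</Rows></Data></ReportDataResponse>"""

root = ET.fromstring(SAMPLE)
for row in root.findall("./Data/Rows/Row"):       # mirrors the forEach expression
    name = row.findtext("Value[@columnId='PAGE_NAME']")
    views = row.findtext("Value[@columnId='PAGE_VIEWS']")
    sku = re.sub(r"^PRODUCT:.*\((.*?)\)$", r"\1", name)  # same regex as the DIH field
    print(sku, views)  # -> W4537 2388
```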

From logs:

INFO: Loading DIH Configuration: data-config.xml
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter 
loadDataConfig

INFO: Data Configuration loaded successfully
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import} 
status=0 QTime=2
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter 
doFullImport

INFO: Starting Full Import
Aug 24, 2012 3:53:10 PM 
org.apache.solr.handler.dataimport.SimplePropertiesWriter 
readIndexerProperties

INFO: Read dataimport.properties
Aug 24, 2012 3:53:10 PM org.apache.solr.update.DirectUpdateHandler2 
deleteAll

INFO: [ssww] REMOVING ALL DOCUMENTS FROM INDEX
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.URLDataSource 
getData
FINE: Accessing URL: 
https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=*&username=***&format=XML&userAuthKey=**&language=en_US&viewID=9475540&period_a=M20110930

Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0

Aug 24, 2012 3:53:12 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=1

Aug 24, 2012 3:53:14 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=1

Aug 24, 2012 3:53:16 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0

Aug 24, 2012 3:53:18 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0

Aug 24, 2012 3:53:20 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0

Aug 24, 2012 3:53:22 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0

Aug 24, 2012 3:53:24 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0

Aug 24, 2012 3:53:27 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0

Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Aug 24, 2012 3:53:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start 
commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}

Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2

commit{dir=/var/lib/tomcat6/solr/apache-solr-4.0.0-BETA/core1/data/index,segFN=segments_2b,generation=83,filenames=[segments_2b]

commit{dir=/var/lib/tomcat6/solr/apache-solr-4.0.0-BETA/core1/data/index,segFN=segments_2c,generation=84,filenames=[segments_2c]
Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrDeletionPolicy 
updateCommits

INFO: newest commit = 84
Aug 24, 2012 3:53:28 PM org.apache.solr.search.SolrIndexSearcher 
INFO: Opening Searcher@ff33d42 main
Aug 24, 2012 3:53:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Aug 24, 2012 3:53:28 PM org.apache.solr.core.

Re: Debugging DIH

2012-08-24 Thread Ahmet Arslan

> That is not completely true. If the columns have the same
> names as the fields, the mapping is redundant. Nevertheless,
> it might be the problem. What I've experienced with Oracle,
> at least, is that the columns would be returned in uppercase
> even if my alias would be in lowercase. You might force it
> by adding quotes, though. Or try adding
> 
> 
> 
> 
> 
> You might check in your preferred SQL client how the column
> names are returned. It might be an indicator. (At least, in
> my case they would be uppercase in SQL Developer.)

There is a jsp page for debugging DIH

http://localhost:8080/solr/admin/dataimport.jsp?handler=/dataimport


Re: Debugging DIH

2012-08-24 Thread Chantal Ackermann
> 
> I don't see that you have anything in the DIH that tells what columns from 
> the query go into which fields in the index.  You need something like
> 
> 
> 
> 
> 

That is not completely true. If the columns have the same names as the fields, 
the mapping is redundant. Nevertheless, it might be the problem. What I've 
experienced with Oracle, at least, is that column names are returned in 
uppercase even if the alias is lowercase. You might force lowercase by quoting 
the alias, though. Or try adding





You might check in your preferred SQL client how the column names are returned. 
It might be an indicator. (At least, in my case they would be uppercase in SQL 
Developer.)
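The stripped snippet above was presumably along these lines (table and column names are hypothetical):

```sql
-- Quote the aliases so Oracle keeps them lowercase, matching the Solr field names.
-- Table and column names here are illustrative only.
SELECT id     AS "id",
       amount AS "amount"
FROM   transactions;
```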

Cheers,
Chantal

Re: Debugging DIH

2012-08-24 Thread Andy Lester

On Aug 24, 2012, at 9:17 AM, Hasan Diwan wrote:

> 
> url="jdbc:h2:tcp://192.168.1.6/finance" user="sa" />
>
>  
>
> 
> 
> and I've added the appropriate fields to schema.xml:
>  
>   
>   
> 
> There's nothing in my index and 343 rows in my table. What is going on? -- H


I don't see that you have anything in the DIH that tells what columns from the 
query go into which fields in the index.  You need something like
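The snippet in the original mail was stripped in archiving; a sketch of the kind of mapping meant here (column and field names are hypothetical):

```xml
<entity name="finance" query="SELECT ID, AMOUNT, TS FROM LEDGER">
  <!-- explicit column-to-field mapping; names are illustrative -->
  <field column="ID"     name="id"/>
  <field column="AMOUNT" name="amount"/>
  <field column="TS"     name="timestamp"/>
</entity>
```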





xoa

--
Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance



Re: Debugging DIH

2012-08-24 Thread Hasan Diwan
On 24 August 2012 07:17, Hasan Diwan  wrote:

> I have some data in an H2 database that I'd like to move to SOLR. I
> probably should/could extract and post the contents as 1 new document per
> record, but I'd like to configure the data import handler and am having
> some difficulty doing so. Following the wiki instructions[1], I have the
> following in my db-data-config.xml:
> 
>  url="jdbc:h2:tcp://192.168.1.6/finance" user="sa" />
> 
>   
> 
> 
>
> I also have dropped the JDBC driver into db/lib, witness:
> % jar tvf ./lib/h2-1.3.164.jar | grep 'Driver'
> 13 Fri Feb 03 12:02:56 PST 2012 META-INF/services/java.sql.Driver
>   2508 Fri Feb 03 12:02:56 PST 2012 org/h2/Driver.class
>485 Fri Feb 03 12:02:56 PST 2012 org/h2/util/DbDriverActivator.class
>
> and I've added the appropriate fields to schema.xml:
>   
>
>
>
> There's nothing in my index and 343 rows in my table. What is going on? --
> H
>

One more data point:
% curl -L "http://localhost:8983/solr/db/dataimport?command=status"

status: idle
config: db-data-config.xml
Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
Started: 2012-08-24 07:19:26, time taken: 0:0:0.328
This response format is experimental. It is likely to change in the future.




-- 
Sent from my mobile device
Envoyait de mon portable


Debugging DIH

2012-08-24 Thread Hasan Diwan
I have some data in an H2 database that I'd like to move to SOLR. I
probably should/could extract and post the contents as 1 new document per
record, but I'd like to configure the data import handler and am having
some difficulty doing so. Following the wiki instructions[1], I have the
following in my db-data-config.xml:



  



I also have dropped the JDBC driver into db/lib, witness:
% jar tvf ./lib/h2-1.3.164.jar | grep 'Driver'
13 Fri Feb 03 12:02:56 PST 2012 META-INF/services/java.sql.Driver
  2508 Fri Feb 03 12:02:56 PST 2012 org/h2/Driver.class
   485 Fri Feb 03 12:02:56 PST 2012 org/h2/util/DbDriverActivator.class

and I've added the appropriate fields to schema.xml:
  
   
   

There's nothing in my index and 343 rows in my table. What is going on? -- H
-- 
Sent from my mobile device
Envoyait de mon portable
1. http://wiki.apache.org/solr/DIHQuickStart


Re: Debugging DIH by placing breakpoints

2011-09-21 Thread Pulkit Singhal
Correct! With that additional info, plus
http://wiki.apache.org/solr/HowToContribute (ant eclipse), plus a
refreshed (close/open) eclipse project ... I'm all set.

Thanks Again.

On Wed, Sep 21, 2011 at 1:43 PM, Gora Mohanty  wrote:
> On Thu, Sep 22, 2011 at 12:08 AM, Pulkit Singhal
>  wrote:
>> Hello,
>>
>> I was wondering where can I find the source code for DIH? I want to
>> check out the source and step through it breakpoint by breakpoint to
>> understand it better :)
>
> Should be under contrib/dataimporthandler in your Solr source
> tree.
>
> Regards,
> Gora
>


Re: Debugging DIH by placing breakpoints

2011-09-21 Thread Gora Mohanty
On Thu, Sep 22, 2011 at 12:08 AM, Pulkit Singhal
 wrote:
> Hello,
>
> I was wondering where can I find the source code for DIH? I want to
> check out the source and step through it breakpoint by breakpoint to
> understand it better :)

Should be under contrib/dataimporthandler in your Solr source
tree.

Regards,
Gora


Debugging DIH by placing breakpoints

2011-09-21 Thread Pulkit Singhal
Hello,

I was wondering where can I find the source code for DIH? I want to
check out the source and step through it breakpoint by breakpoint to
understand it better :)

Thanks!
- Pulkit


Re: Debugging - DIH Delta Queries-

2010-05-25 Thread Chris Hostetter

: Subject: Debugging - DIH Delta Queries- 
: References:
: <1659766275.5213.1274376509278.javamail.r...@vicenza.dmz.lexum.pri>
: In-Reply-To:
: <1659766275.5213.1274376509278.javamail.r...@vicenza.dmz.lexum.pri>

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking




-Hoss



Debugging - DIH Delta Queries-

2010-05-20 Thread Vladimir Sutskever
Hi All,

How can I see all of the queries sent to my DB during a Delta Import?

It seems like my documents are not being updated via delta import
When I use Solr's DataImport Handler Console with delta-import selected, I see





But that’s not very helpful - I want to see the exact queries


Thank You



Re: RegexTransformer debugging (DIH)

2008-10-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
If it is a normal exception, it is logged along with the number of the
document where it failed, and you can put it under a debugger with
start=&rows=1.

We do not catch Throwable or Error, so those slip through.

If you are adventurous enough, wrap RegexTransformer in your own class,
apply it with, say, transformer="my.RegexWrapper", catch Throwable
there, and print out the row.
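A compile-checked sketch of that idea around a plain regex call (a real wrapper would extend the DIH RegexTransformer; the class and method names here are illustrative):

```java
import java.util.regex.Pattern;

public class Main {
    // Wrap the regex call, catch Throwable (which includes StackOverflowError),
    // and report the offending input instead of killing the import thread.
    static String safeReplace(Pattern p, String input, String replacement) {
        try {
            return p.matcher(input).replaceAll(replacement);
        } catch (Throwable t) {
            System.err.println("regex failed on input of length "
                               + input.length() + ": " + t);
            return input; // fall back to the raw value
        }
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("^PRODUCT:.*\\((.*?)\\)$");
        System.out.println(
            safeReplace(p, "PRODUCT: BURLAP POTATO SACKS (PACK OF 12) (W4537)", "$1"));
    }
}
```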




On Thu, Oct 16, 2008 at 9:49 PM, Jon Baer <[EMAIL PROTECTED]> wrote:
> Is there a way to prevent this from occurring (or a way to nail down the doc
> which is causing it?):
>
> INFO: [news] webapp=/solr path=/admin/dataimport params={command=status}
> status=0 QTime=0
> Exception in thread "Thread-14" java.lang.StackOverflowError
>at java.util.regex.Pattern$Single.match(Pattern.java:3313)
>at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4763)
>at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
>at java.util.regex.Pattern$All.match(Pattern.java:4079)
>at java.util.regex.Pattern$Branch.match(Pattern.java:4538)
>at java.util.regex.Pattern$GroupHead.match(Pattern.java:4578)
>at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4767)
>at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
>at java.util.regex.Pattern$All.match(Pattern.java:4079)
>at java.util.regex.Pattern$Branch.match(Pattern.java:4538)
>at java.util.regex.Pattern$GroupHead.match(Pattern.java:4578)
>at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4767)
>at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
>at java.util.regex.Pattern$All.match(Pattern.java:4079)
>
> Thanks.
>
> - Jon
>
>



-- 
--Noble Paul


RegexTransformer debugging (DIH)

2008-10-16 Thread Jon Baer
Is there a way to prevent this from occurring (or a way to nail down  
the doc which is causing it?):


INFO: [news] webapp=/solr path=/admin/dataimport  
params={command=status} status=0 QTime=0

Exception in thread "Thread-14" java.lang.StackOverflowError
at java.util.regex.Pattern$Single.match(Pattern.java:3313)
at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4763)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
at java.util.regex.Pattern$All.match(Pattern.java:4079)
at java.util.regex.Pattern$Branch.match(Pattern.java:4538)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4578)
at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4767)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
at java.util.regex.Pattern$All.match(Pattern.java:4079)
at java.util.regex.Pattern$Branch.match(Pattern.java:4538)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4578)
at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4767)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4637)
at java.util.regex.Pattern$All.match(Pattern.java:4079)

Thanks.

- Jon