Re: Information for solr-user@lucene.apache.org

2016-05-21 Thread Carl Roberts
Let's try this one (solr-user-digest-subscr...@lucene.apache.org) - 
maybe a real person will answer there.


On 5/21/16 9:09 AM, Carl Roberts wrote:
And, these responses are just weird.  Do they mean this user list is 
obsolete?  Is Solr no longer supported via a user list where we can 
ask questions?


On 5/21/16 9:08 AM, solr-user-h...@lucene.apache.org wrote:

Hi! This is the ezmlm program. I'm managing the
solr-user@lucene.apache.org mailing list.

I'm working for my owner, who can be reached
at solr-user-ow...@lucene.apache.org.

No information has been provided for this list.

--- Administrative commands for the solr-user list ---

I can handle administrative requests automatically. Please
do not send them to the list address! Instead, send
your message to the correct command address:

To subscribe to the list, send a message to:


To remove your address from the list, send a message to:


Send mail to the following for info and FAQ for this list:



Similar addresses exist for the digest list:



To get messages 123 through 145 (a maximum of 100 per request), mail:


To get an index with subject and author for messages 123-456 , mail:


They are always returned as sets of 100, max 2000 per request,
so you'll actually get 100-499.

To receive all messages with the same subject as message 12345,
send a short message to:


The messages should contain one line or word of text to avoid being
treated as sp@m, but I will ignore their content.
Only the ADDRESS you send to is important.

You can start a subscription for an alternate address,
for example "john@host.domain", just add a hyphen and your
address (with '=' instead of '@') after the command word:


To stop subscription for this address, mail:


In both cases, I'll send a confirmation message to that address. When
you receive it, simply reply to it to complete your subscription.

If despite following these instructions, you do not get the
desired results, please contact my owner at
solr-user-ow...@lucene.apache.org. Please be patient, my owner is a
lot slower than I am ;-)


How to perform a contains query

2016-05-25 Thread Carl Roberts

Hi,

Sorry to ask this question again but I had some recent issues with SPAM 
filtering so I don't know if someone responded before or not.


Basically, I am looking for a way to query a field for a substring, with 
functionality similar to what Java String.contains provides.


So, for example, if I have these two summary field values:

"summary": "Apache Tomcat 7.x before 7.0.10 does not follow
: > ServletSecurity annotations, which allows remote attackers to bypass
: > intended access
: > restrictions via HTTP requests to a web application.",
: >
"summary": "Apache Tomcat 7.0.0 through 7.0.6 and 6.0.0 through
: > 6.0.30 does not enforce the maxHttpHeaderSize limit for requests 
involving
: > the NIO HTTP connector, which allows remote attackers to cause a 
denial of

: > service (OutOfMemoryError) via a crafted request.",

I want to be able to provide a query that states: give me all records 
whose summary field contains "Apache Tomcat 7", so that both 
records above are returned.


Is there a way to do that?

Regards,

Joe
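
One way to get this kind of match (a sketch, assuming summary is an ordinary 
tokenized text field and that the complexphrase query parser available in 
Solr 4.8+ is acceptable) is a phrase query with a trailing wildcard:

curl -G 'http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true' \
     --data-urlencode 'q={!complexphrase}summary:"Apache Tomcat 7*"'

Standard text analysis keeps version strings such as 7.0.10 and 7.x as single 
tokens, so the wildcard on the last term is what lets a bare "7" match both.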

On 5/23/16 12:37 PM, Chris Hostetter wrote:

The mailing list you are looking for is "solr-user@lucene.apache.org"

solr-user-info is an automated bot for giving you info about the list
solr-user-owner is for contacting the human moderators of the mailing 
list

with help




: Date: Sat, 21 May 2016 09:07:00 -0400
: From: Carl Roberts 
: To: solr-user-i...@lucene.apache.org, solr-user-ow...@lucene.apache.org
: Subject: Re: How to properly query indexed Data
:
: What is a reasonable time to expect an answer to questions in this user list?

:
: On 5/18/16 8:55 PM, Carl Roberts wrote:
: > Hi,
: >
: > I am using Solr 4.10.3 and the default URL for the GUI that comes with
: > Solr:
: >
: > This is the URL: http://10.1.161.23:8983/solr/#/nvd-rss/query
: >
: > I have the following entries with field summary that are indexed.
: >
: > If I search for summary:"Apache Tomcat 7" I only get 10 results and the
: > ones with Apache Tomcat 7.0.0 in the summary are missing from the results.
: >
: > If I search for summary:"Apache Tomcat 7.0.0" I only get the 3 with
: > "Apache Tomcat 7.0.0" in the summary.
: >
: > How do I get all of them?  What filter should I use?  I guess I am looking
: > for a filter that says this:
: >
: > Give me all Entries that start with "Apache Tomcat 7" where 7 can be
: > followed by 0.0 as in 7.0.0 or it can be followed by x as in 7.x or
: > anything else.
: >
: > How do I do that?
: >
: > ~/dev/temp$ grep  "Apache Tomcat 7" apache-tomcat-query.txt
: >
: > "summary": "Apache Tomcat 7.x before 7.0.10 does not follow
: > ServletSecurity annotations, which allows remote attackers to bypass
: > intended access
: > restrictions via HTTP requests to a web application.",
: >
: > "summary": "Apache Tomcat 7.0.0 through 7.0.6 and 6.0.0 
through
: > 6.0.30 does not enforce the maxHttpHeaderSize limit for requests 
involving
: > the NIO HTTP connector, which allows remote attackers to cause a 
denial of

: > service (OutOfMemoryError) via a crafted request.",
: >
: > "summary": 
"org/apache/catalina/core/DefaultInstanceManager.java in
: > Apache Tomcat 7.x before 7.0.22 does not properly restrict 
ContainerServlets
: > in the Manager application, which allows local users to gain 
privileges by
: > using an untrusted web application to access the Manager 
application's

: > functionality.",
: >
: > "summary": "Unrestricted file upload vulnerability in 
Apache Tomcat
: > 7.x before 7.0.40, in certain situations involving outdated 
java.io.File
: > code and a custom JMX configuration, allows remote attackers to 
execute

: > arbitrary code by uploading and accessing a JSP file.",
: >
: > "summary": "A certain tomcat7 package for Apache Tomcat 7 
in Red Hat
: > Enterprise Linux (RHEL) 7 allows remote attackers to cause a 
denial of
: > service (CPU consumption) via a crafted request.  NOTE: this 
vulnerability

: > exists because of an unspecified regression.",
: >
: > "summary": "Apache Tomcat 7.0.0 through 7.0.3, 6.0.x, and 
5.5.x,
: > when running within a SecurityManager, does not make the 
ServletContext
: > attribute read-only, which allows local web applications to read 
or write
: > files outside of the intended working directory, as demonstrated 
using a

: > directory traversal attack.",
: >
: > "summary": "Apache Tomcat 7.0.11, when web.xml has no login
: > configuration, does not follow security constraints, which allows 
remote
: > a

Need Help with custom ZIPURLDataSource class

2015-01-23 Thread Carl Roberts


Hi,

I created a custom ZIPURLDataSource class to unzip the content from an
http URL for an XML ZIP file and it seems to be working (at least I have
no errors), but no data is imported.

Here is my configuration in rss-data-config.xml:




<dataConfig>
    <dataSource type="ZIPURLDataSource" />
    <document>
        <entity name="..."
                url="https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip"
                processor="XPathEntityProcessor"
                forEach="/nvd/entry"
                transformer="DateFormatTransformer">
            <!-- field mappings not preserved by the list archive -->
        </entity>
    </document>
</dataConfig>

Attached is the ZIPURLDataSource.java file.
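
The attachment itself is not reproduced here; a minimal sketch of what such a 
data source can look like (assuming only the stock DIH DataSource<Reader> 
contract and that the ZIP's first entry is the XML feed) is:

package org.apache.solr.handler.dataimport;

import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.zip.ZipInputStream;

/** Sketch: fetch a ZIP from the entity's url attribute and hand the first
 *  entry (assumed to be the XML feed) to the XPathEntityProcessor. */
public class ZIPURLDataSource extends DataSource<Reader> {

    private int connectTimeout = 5000;   // could instead be read from initProps
    private int readTimeout = 30000;

    @Override
    public void init(Context context, Properties initProps) {
        // timeouts, headers, etc. would be configured here from initProps
    }

    @Override
    public Reader getData(String query) {
        // query is the url="..." attribute of the entity
        try {
            URLConnection conn = new URL(query).openConnection();
            conn.setConnectTimeout(connectTimeout);
            conn.setReadTimeout(readTimeout);
            ZipInputStream zis = new ZipInputStream(conn.getInputStream());
            if (zis.getNextEntry() == null) {
                throw new IOException("ZIP at " + query + " contains no entries");
            }
            // the entity processor closes this Reader, which closes the ZIP stream
            return new InputStreamReader(zis, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new RuntimeException("Failed to read ZIP from " + query, e);
        }
    }

    @Override
    public void close() {
        // nothing held open between calls
    }
}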

It actually unzips and saves the raw XML to disk, which I have verified to be a 
valid XML file.  The file has one or more entries (here is an example):

http://scap.nist.gov/schema/scap-core/0.1";
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
xmlns:patch="http://scap.nist.gov/schema/patch/0.1";
xmlns:vuln="http://scap.nist.gov/schema/vulnerability/0.4";
xmlns:cvss="http://scap.nist.gov/schema/cvss-v2/0.2";
xmlns:cpe-lang="http://cpe.mitre.org/language/2.0";
xmlns="http://scap.nist.gov/schema/feed/vulnerability/2.0";
pub_date="2015-01-10T05:37:05"
xsi:schemaLocation="http://scap.nist.gov/schema/patch/0.1
http://nvd.nist.gov/schema/patch_0.1.xsd
http://scap.nist.gov/schema/scap-core/0.1
http://nvd.nist.gov/schema/scap-core_0.1.xsd
http://scap.nist.gov/schema/feed/vulnerability/2.0
http://nvd.nist.gov/schema/nvd-cve-feed_2.0.xsd"; nvd_xml_version="2.0">

http://nvd.nist.gov/";>



























cpe:/o:freebsd:freebsd:2.2.8
cpe:/o:freebsd:freebsd:1.1.5.1
cpe:/o:freebsd:freebsd:2.2.3
cpe:/o:freebsd:freebsd:2.2.2
cpe:/o:freebsd:freebsd:2.2.5
cpe:/o:freebsd:freebsd:2.2.4
cpe:/o:freebsd:freebsd:2.0.5
cpe:/o:freebsd:freebsd:2.2.6
cpe:/o:freebsd:freebsd:2.1.6.1
cpe:/o:freebsd:freebsd:2.0.1
cpe:/o:freebsd:freebsd:2.2
cpe:/o:freebsd:freebsd:2.0
cpe:/o:openbsd:openbsd:2.3
cpe:/o:freebsd:freebsd:3.0
cpe:/o:freebsd:freebsd:1.1
cpe:/o:freebsd:freebsd:2.1.6
cpe:/o:openbsd:openbsd:2.4
cpe:/o:bsdi:bsd_os:3.1
cpe:/o:freebsd:freebsd:1.0
cpe:/o:freebsd:freebsd:2.1.7
cpe:/o:freebsd:freebsd:1.2
cpe:/o:freebsd:freebsd:2.1.5
cpe:/o:freebsd:freebsd:2.1.7.1

CVE-1999-0001
1999-12-30T00:00:00.000-05:00
2010-12-16T00:00:00.000-05:00


5.0
NETWORK
LOW
NONE
NONE
NONE
PARTIAL
http://nvd.nist.gov
2004-01-01T00:00:00.000-05:00




OSVDB
http://www.osvdb.org/5707";
xml:lang="en">5707


CONFIRM
http://www.openbsd.org/errata23.html#tcpfix";
xml:lang="en">http://www.openbsd.org/errata23.html#tcpfix

ip_input.c in BSD-derived TCP/IP implementations allows
remote attackers to cause a denial of service (crash or hang) via
crafted packets.



Here is the curl command:

curl http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import
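
While it runs, progress can also be checked with the standard DIH status command:

curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=status"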

And here is the output from the console for Jetty:

main{StandardDirectoryReader(segments_1:1:nrt)}
2407 [coreLoadExecutor-5-thread-1] INFO
org.apache.solr.core.CoreContainer – registering core: nvd-rss
2409 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
user.dir=/Users/carlroberts/dev/solr-4.10.3/example
2409 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
SolrDispatchFilter.init() done
2431 [main] INFO org.eclipse.jetty.server.AbstractConnector – Started
SocketConnector@0.0.0.0:8983
2450 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
[nvd-rss] webapp=null path=null
params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false}
hits=0 status=0 QTime=43
2451 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
QuerySenderListener done.
2451 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SpellCheckComponent – Loading spell
index for spellchecker: default
2451 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SpellCheckComponent – Loading spell
index for spellchecker: wordbreak
2452 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SuggestComponent – Loading suggester
index for: mySuggester
2452 [searcherExecutor-6-thread-1] INFO
org.apache.solr.spelling.suggest.SolrSuggester – reload()
2452 [searcherExecutor-6-thread-1] INFO
org.apache.solr.spelling.suggest.SolrSuggester – build()
2459 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
[nvd-rss] Registered new searcher Searcher@df9e84e[nvd-rss]
main{StandardDirectoryReader(segments_1:1:nrt)}
8371 [qtp1640586218-17] INFO
org.apache.solr.handler.dataimport.DataImporter – Loading DIH
Configuration: rss-data-config.xml
8379 [qtp1640586218-17] INFO
org.apache.solr.handler.dataimport.DataImporter – Data Configuration
loaded successfully
8383 [Thread-15] INFO org.apache.solr.handler.dataimport.DataImporter –
Starting Full Import
8384 [qtp1640586218-17] INFO org.apache.solr.core.SolrCore – [nvd-rss]
webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=15
8396 [Thread-15] INFO
org.apache.solr.handler.dataimport.SimplePropertiesWriter – Read
dataimport.properties
23431 [commitScheduler-8-thread-1] INFO
org.apache.solr.update.UpdateHandler – start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommi

Need help importing data

2015-01-23 Thread Carl Roberts

Hi,

I have set log4j logging to level DEBUG and I have also modified the 
code to see what is being imported and I can see the nextRow() records, 
and the import is successful; however, I have no data.  Can someone 
please help me figure this out?


Here is the logging output:

ow:  r1={{id=CVE-2002-2353, cve=CVE-2002-2353, cwe=CWE-264, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2353, cve=CVE-2002-2353, cwe=CWE-264, $forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2354, cve=CVE-2002-2354, cwe=CWE-20, $forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2354, cve=CVE-2002-2354, cwe=CWE-20, $forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2355, cve=CVE-2002-2355, cwe=CWE-255, $forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2355, cve=CVE-2002-2355, cwe=CWE-255, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2356, cve=CVE-2002-2356, cwe=CWE-264, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2356, cve=CVE-2002-2356, cwe=CWE-264, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2357, cve=CVE-2002-2357, cwe=CWE-119, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2357, cve=CVE-2002-2357, cwe=CWE-119, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2358, cve=CVE-2002-2358, cwe=CWE-79, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2358, cve=CVE-2002-2358, cwe=CWE-79, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2359, cve=CVE-2002-2359, cwe=CWE-79, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2359, cve=CVE-2002-2359, cwe=CWE-79, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2360, cve=CVE-2002-2360, cwe=CWE-264, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2360, cve=CVE-2002-2360, cwe=CWE-264, $forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPath

Re: Need Help with custom ZIPURLDataSource class

2015-01-23 Thread Carl Roberts

NVM - I have this working.

The problem was this:  pk="link" in rss-data-config.xml, but the unique key 
in schema.xml is not link - it is id.


From rss-data-config.xml:


url="https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";

processor="XPathEntityProcessor"
forEach="/nvd/entry">

commonField="true" />
commonField="true" />




From schema.xml:

<uniqueKey>id</uniqueKey>

What really bothers me is that there were no errors output by Solr to 
indicate this type of misconfiguration error and all the messages that 
Solr gave indicated the import was successful.  This lack of appropriate 
error reporting is a pain, especially for someone learning Solr.


Switching pk="link" to pk="id" solved the problem and I was then able to 
import the data.
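
Pieced together, the working combination looks roughly like this (a sketch; 
the entity name and the id xpath are assumptions, since the archived config 
lost most of its markup):

<!-- rss-data-config.xml: pk should name the same field as the schema's uniqueKey -->
<entity name="..."
        pk="id"
        url="https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip"
        processor="XPathEntityProcessor"
        forEach="/nvd/entry">
    <field column="id" xpath="/nvd/entry/@id" commonField="true" />
    ...
</entity>

<!-- schema.xml -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>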

On 1/23/15, 6:34 PM, Carl Roberts wrote:


Hi,

I created a custom ZIPURLDataSource class to unzip the content from an
http URL for an XML ZIP file and it seems to be working (at least I have
no errors), but no data is imported.

Here is my configuration in rss-data-config.xml:




https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry"
transformer="DateFormatTransformer">




xpath="/nvd/entry/vulnerable-software-list/product" 
commonField="false" />









Attached is the ZIPURLDataSource.java file.

It actually unzips and saves the raw XML to disk, which I have 
verified to be a valid XML file.  The file has one or more entries 
(here is an example):


http://scap.nist.gov/schema/scap-core/0.1";
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
xmlns:patch="http://scap.nist.gov/schema/patch/0.1";
xmlns:vuln="http://scap.nist.gov/schema/vulnerability/0.4";
xmlns:cvss="http://scap.nist.gov/schema/cvss-v2/0.2";
xmlns:cpe-lang="http://cpe.mitre.org/language/2.0";
xmlns="http://scap.nist.gov/schema/feed/vulnerability/2.0";
pub_date="2015-01-10T05:37:05"
xsi:schemaLocation="http://scap.nist.gov/schema/patch/0.1
http://nvd.nist.gov/schema/patch_0.1.xsd
http://scap.nist.gov/schema/scap-core/0.1
http://nvd.nist.gov/schema/scap-core_0.1.xsd
http://scap.nist.gov/schema/feed/vulnerability/2.0
http://nvd.nist.gov/schema/nvd-cve-feed_2.0.xsd"; nvd_xml_version="2.0">

http://nvd.nist.gov/";>



























cpe:/o:freebsd:freebsd:2.2.8
cpe:/o:freebsd:freebsd:1.1.5.1
cpe:/o:freebsd:freebsd:2.2.3
cpe:/o:freebsd:freebsd:2.2.2
cpe:/o:freebsd:freebsd:2.2.5
cpe:/o:freebsd:freebsd:2.2.4
cpe:/o:freebsd:freebsd:2.0.5
cpe:/o:freebsd:freebsd:2.2.6
cpe:/o:freebsd:freebsd:2.1.6.1
cpe:/o:freebsd:freebsd:2.0.1
cpe:/o:freebsd:freebsd:2.2
cpe:/o:freebsd:freebsd:2.0
cpe:/o:openbsd:openbsd:2.3
cpe:/o:freebsd:freebsd:3.0
cpe:/o:freebsd:freebsd:1.1
cpe:/o:freebsd:freebsd:2.1.6
cpe:/o:openbsd:openbsd:2.4
cpe:/o:bsdi:bsd_os:3.1
cpe:/o:freebsd:freebsd:1.0
cpe:/o:freebsd:freebsd:2.1.7
cpe:/o:freebsd:freebsd:1.2
cpe:/o:freebsd:freebsd:2.1.5
cpe:/o:freebsd:freebsd:2.1.7.1

CVE-1999-0001
1999-12-30T00:00:00.000-05:00 

2010-12-16T00:00:00.000-05:00 




5.0
NETWORK
LOW
NONE
NONE
NONE
PARTIAL
http://nvd.nist.gov
2004-01-01T00:00:00.000-05:00 






OSVDB
http://www.osvdb.org/5707";
xml:lang="en">5707


CONFIRM
http://www.openbsd.org/errata23.html#tcpfix";
xml:lang="en">http://www.openbsd.org/errata23.html#tcpfix 



ip_input.c in BSD-derived TCP/IP implementations allows
remote attackers to cause a denial of service (crash or hang) via
crafted packets.



Here is the curl command:

curl http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import

And here is the output from the console for Jetty:

main{StandardDirectoryReader(segments_1:1:nrt)}
2407 [coreLoadExecutor-5-thread-1] INFO
org.apache.solr.core.CoreContainer – registering core: nvd-rss
2409 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
user.dir=/Users/carlroberts/dev/solr-4.10.3/example
2409 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
SolrDispatchFilter.init() done
2431 [main] INFO org.eclipse.jetty.server.AbstractConnector – Started
SocketConnector@0.0.0.0:8983
2450 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
[nvd-rss] webapp=null path=null
params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false} 


hits=0 status=0 QTime=43
2451 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
QuerySenderListener done.
2451 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SpellCheckComponent – Loading spell
index for spellchecker: default
2451 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SpellCheckComponent – Loading spell
index

Re: Need help importing data

2015-01-23 Thread Carl Roberts

NVM

I figured this out.  The problem was this:  pk="link" in 
rss-data-config.xml, but the unique key in schema.xml is not link - it is id.


From rss-data-config.xml:

https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry">

commonField="true" />
commonField="true" />




From schema.xml:

<uniqueKey>id</uniqueKey>

What really bothers me is that there were no errors output by Solr to 
indicate this type of misconfiguration error and all the messages that 
Solr gave indicated the import was successful.  This lack of appropriate 
error reporting is a pain, especially for someone learning Solr.


Switching pk="link" to pk="id" solved the problem and I was then able to 
import the data.



On 1/23/15, 9:39 PM, Carl Roberts wrote:

Hi,

I have set log4j logging to level DEBUG and I have also modified the 
code to see what is being imported and I can see the nextRow() 
records, and the import is successful, however I have no data. Can 
someone please help me figure this out?


Here is the logging output:

ow:  r1={{id=CVE-2002-2353, cve=CVE-2002-2353, cwe=CWE-264, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2353, cve=CVE-2002-2353, cwe=CWE-264, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2354, cve=CVE-2002-2354, cwe=CWE-20, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2354, cve=CVE-2002-2354, cwe=CWE-20, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2355, cve=CVE-2002-2355, cwe=CWE-255, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,606- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2355, cve=CVE-2002-2355, cwe=CWE-255, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2356, cve=CVE-2002-2356, cwe=CWE-264, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2356, cve=CVE-2002-2356, cwe=CWE-264, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2357, cve=CVE-2002-2357, cwe=CWE-119, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2357, cve=CVE-2002-2357, cwe=CWE-119, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2358, cve=CVE-2002-2358, cwe=CWE-79, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r3={{id=CVE-2002-2358, cve=CVE-2002-2358, cwe=CWE-79, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:221]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
URL={url}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:227]-org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow: 
r1={{id=CVE-2002-2359, cve=CVE-2002-2359, cwe=CWE-79, 
$forEach=/nvd/entry}}
2015-01-23 21:28:04,607- 
INFO-[Thread-15]-[XPathEntityProcessor.java:251

How do you parse the data in a field that is returned from a query?

2015-01-24 Thread Carl Roberts

Hi,

How can I parse the data in a field that is returned from a query?

Basically,

I have a multi-valued field that contains values such as these that are 
returned from a query:


  "cpe:/o:freebsd:freebsd:1.1.5.1",
  "cpe:/o:freebsd:freebsd:2.2.3",
  "cpe:/o:freebsd:freebsd:2.2.2",
  "cpe:/o:freebsd:freebsd:2.2.5",
  "cpe:/o:freebsd:freebsd:2.2.4",
  "cpe:/o:freebsd:freebsd:2.0.5",
  "cpe:/o:freebsd:freebsd:2.2.6",
  "cpe:/o:freebsd:freebsd:2.1.6.1",
  "cpe:/o:freebsd:freebsd:2.0.1",
  "cpe:/o:freebsd:freebsd:2.2",
  "cpe:/o:freebsd:freebsd:2.0",
  "cpe:/o:openbsd:openbsd:2.3",
  "cpe:/o:freebsd:freebsd:3.0",
  "cpe:/o:freebsd:freebsd:1.1",
  "cpe:/o:freebsd:freebsd:2.1.6",
  "cpe:/o:openbsd:openbsd:2.4",
  "cpe:/o:bsdi:bsd_os:3.1",
  "cpe:/o:freebsd:freebsd:1.0",
  "cpe:/o:freebsd:freebsd:2.1.7",
  "cpe:/o:freebsd:freebsd:1.2",
  "cpe:/o:freebsd:freebsd:2.1.5",
  "cpe:/o:freebsd:freebsd:2.1.7.1"],

And my problem is that I need to strip the cpe:/o part and I also need 
to tokenize words using the (:) as a separator so that I can then search 
for "freebsd 1.1" or "openbsd 2.4" or just "freebsd".


Thanks in advance.

Joe


Re: How do you parse the data in a field that is returned from a query?

2015-01-24 Thread Carl Roberts

Sorry if I was not clear.  What I am asking is this:

How can I parse the data during import to tokenize it by (:) and strip 
the cpe:/o?



On 1/24/15, 3:28 PM, Alexandre Rafalovitch wrote:

You are using keywords here that seem to contradict with each other.
Or your use case is not clear.

Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
whatever you want with it!

But then at the end, you say you want to search for whatever you
stripped off. So, that should be back in Solr again?

Or are you asking something along these lines:
1. I have a multiValued field with the following sample content... (it
does not matter to Solr where it comes from)
2. I wanted it returned as is, but I want to be able to find documents
when somebody searches for X, Y, or Z
3. What would be the best analyzer chain to be able to do so?

Regards,
Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 24 January 2015 at 15:04, Carl Roberts  wrote:

Hi,

How can I parse the data in a field that is returned from a query?

Basically,

I have a multi-valued field that contains values such as these that are
returned from a query:

   "cpe:/o:freebsd:freebsd:1.1.5.1",
   "cpe:/o:freebsd:freebsd:2.2.3",
   "cpe:/o:freebsd:freebsd:2.2.2",
   "cpe:/o:freebsd:freebsd:2.2.5",
   "cpe:/o:freebsd:freebsd:2.2.4",
   "cpe:/o:freebsd:freebsd:2.0.5",
   "cpe:/o:freebsd:freebsd:2.2.6",
   "cpe:/o:freebsd:freebsd:2.1.6.1",
   "cpe:/o:freebsd:freebsd:2.0.1",
   "cpe:/o:freebsd:freebsd:2.2",
   "cpe:/o:freebsd:freebsd:2.0",
   "cpe:/o:openbsd:openbsd:2.3",
   "cpe:/o:freebsd:freebsd:3.0",
   "cpe:/o:freebsd:freebsd:1.1",
   "cpe:/o:freebsd:freebsd:2.1.6",
   "cpe:/o:openbsd:openbsd:2.4",
   "cpe:/o:bsdi:bsd_os:3.1",
   "cpe:/o:freebsd:freebsd:1.0",
   "cpe:/o:freebsd:freebsd:2.1.7",
   "cpe:/o:freebsd:freebsd:1.2",
   "cpe:/o:freebsd:freebsd:2.1.5",
   "cpe:/o:freebsd:freebsd:2.1.7.1"],

And my problem is that I need to strip the cpe:/o part and I also need to
tokenize words using the (:) as a separator so that I can then search for
"freebsd 1.1" or "openbsd 2.4" or just "freebsd".

Thanks in advance.

Joe




Re: How do you parse the data in a field that is returned from a query?

2015-01-24 Thread Carl Roberts
Yes - I am using DIH and I am reading the info from an XML file using 
the URL datasource, and I want to strip the cpe:/o and tokenize the data 
by (:) during import so I can then search it as I've described. So, my 
question is this:


Is there any built in logic via a transformer class that could do this?  
If not, how would you recommend I do this?


Regards,

Joe

On 1/24/15, 3:38 PM, Jack Krupansky wrote:

Or, maybe... he's using DIH and getting these values from an RDBMS database
query and now wants to index them in Solr. Who knows!

It might be simplest to transform the colons to spaces and use a normal
text field. Although you could use a custom text field type that used a
regex tokenizer which treated the colons as token separators.


-- Jack Krupansky
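
A sketch of that second option (the field type name and exact patterns are 
assumptions, not from the thread): a char filter strips the cpe:/o: prefix 
and a pattern tokenizer splits the rest on colons, so "freebsd", "freebsd 1.1" 
or "openbsd 2.4" become ordinary term or phrase queries:

<fieldType name="cpe_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- drop the leading "cpe:/o:" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="^cpe:/o:" replacement=""/>
    <!-- split on colons (and whitespace, so query text tokenizes the same way) -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[:\s]+"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

With this, "cpe:/o:freebsd:freebsd:1.1" indexes as the tokens freebsd / freebsd / 1.1.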

On Sat, Jan 24, 2015 at 3:28 PM, Alexandre Rafalovitch 
wrote:


You are using keywords here that seem to contradict with each other.
Or your use case is not clear.

Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
whatever you want with it!

But then at the end, you say you want to search for whatever you
stripped off. So, that should be back in Solr again?

Or are you asking something along these lines:
1. I have a multiValued field with the following sample content... (it
does not matter to Solr where it comes from)
2. I wanted it returned as is, but I want to be able to find documents
when somebody searches for X, Y, or Z
3. What would be the best analyzer chain to be able to do so?

Regards,
Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 24 January 2015 at 15:04, Carl Roberts 
wrote:

Hi,

How can I parse the data in a field that is returned from a query?

Basically,

I have a multi-valued field that contains values such as these that are
returned from a query:

   "cpe:/o:freebsd:freebsd:1.1.5.1",
   "cpe:/o:freebsd:freebsd:2.2.3",
   "cpe:/o:freebsd:freebsd:2.2.2",
   "cpe:/o:freebsd:freebsd:2.2.5",
   "cpe:/o:freebsd:freebsd:2.2.4",
   "cpe:/o:freebsd:freebsd:2.0.5",
   "cpe:/o:freebsd:freebsd:2.2.6",
   "cpe:/o:freebsd:freebsd:2.1.6.1",
   "cpe:/o:freebsd:freebsd:2.0.1",
   "cpe:/o:freebsd:freebsd:2.2",
   "cpe:/o:freebsd:freebsd:2.0",
   "cpe:/o:openbsd:openbsd:2.3",
   "cpe:/o:freebsd:freebsd:3.0",
   "cpe:/o:freebsd:freebsd:1.1",
   "cpe:/o:freebsd:freebsd:2.1.6",
   "cpe:/o:openbsd:openbsd:2.4",
   "cpe:/o:bsdi:bsd_os:3.1",
   "cpe:/o:freebsd:freebsd:1.0",
   "cpe:/o:freebsd:freebsd:2.1.7",
   "cpe:/o:freebsd:freebsd:1.2",
   "cpe:/o:freebsd:freebsd:2.1.5",
   "cpe:/o:freebsd:freebsd:2.1.7.1"],

And my problem is that I need to strip the cpe:/o part and I also need to
tokenize words using the (:) as a separator so that I can then search for
"freebsd 1.1" or "openbsd 2.4" or just "freebsd".

Thanks in advance.

Joe




Re: How do you parse the data in a field that is returned from a query?

2015-01-24 Thread Carl Roberts
Via this rss-data-config.xml file and a class that I wrote (attached) to 
download an XML file from a ZIP URL:



readTimeout="3"/>


https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry">
commonField="false" />
commonField="false" />
commonField="false" />
xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name" 
commonField="false" />
xpath="/nvd/entry/vulnerable-software-list/product" commonField="false" />
xpath="/nvd/entry/published-datetime" commonField="false" />
xpath="/nvd/entry/last-modified-datetime" commonField="false" />
commonField="false" />


http://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2003.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry">
commonField="false" />
commonField="false" />
commonField="false" />
xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name" 
commonField="false" />
xpath="/nvd/entry/vulnerable-software-list/product" commonField="false" />
xpath="/nvd/entry/published-datetime" commonField="false" />
xpath="/nvd/entry/last-modified-datetime" commonField="false" />
commonField="false" />






On 1/24/15, 3:45 PM, Jack Krupansky wrote:

How are you currently importing data?

-- Jack Krupansky

On Sat, Jan 24, 2015 at 3:42 PM, Carl Roberts 
wrote:
Sorry if I was not clear.  What I am asking is this:

How can I parse the data during import to tokenize it by (:) and strip the
cpe:/o?



On 1/24/15, 3:28 PM, Alexandre Rafalovitch wrote:


You are using keywords here that seem to contradict with each other.
Or your use case is not clear.

Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
whatever you want with it!

But then at the end, you say you want to search for whatever you
stripped off. So, that should be back in Solr again?

Or are you asking something along these lines:
1. I have a multiValued field with the following sample content... (it
does not matter to Solr where it comes from)
2. I wanted it returned as is, but I want to be able to find documents
when somebody searches for X, Y, or Z
3. What would be the best analyzer chain to be able to do so?

Regards,
 Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 24 January 2015 at 15:04, Carl Roberts 
wrote:


Hi,

How can I parse the data in a field that is returned from a query?

Basically,

I have a multi-valued field that contains values such as these that are
returned from a query:

"cpe:/o:freebsd:freebsd:1.1.5.1",
"cpe:/o:freebsd:freebsd:2.2.3",
"cpe:/o:freebsd:freebsd:2.2.2",
"cpe:/o:freebsd:freebsd:2.2.5",
"cpe:/o:freebsd:freebsd:2.2.4",
"cpe:/o:freebsd:freebsd:2.0.5",
"cpe:/o:freebsd:freebsd:2.2.6",
"cpe:/o:freebsd:freebsd:2.1.6.1",
"cpe:/o:freebsd:freebsd:2.0.1",
"cpe:/o:freebsd:freebsd:2.2",
"cpe:/o:freebsd:freebsd:2.0",
"cpe:/o:openbsd:openbsd:2.3",
"cpe:/o:freebsd:freebsd:3.0",
"cpe:/o:freebsd:freebsd:1.1",
"cpe:/o:freebsd:freebsd:2.1.6",
"cpe:/o:openbsd:openbsd:2.4",
"cpe:/o:bsdi:bsd_os:3.1",
"cpe:/o:freebsd:freebsd:1.0",
"cpe:/o:freebsd:freebsd:2.1.7",
"cpe:/o:freebsd:freebsd:1.2",
"cpe:/o:freebsd:freebsd:2.1.5",
"cpe:/o:freebsd:freebsd:2.1.7.1"],

And my problem is that I need to strip the cpe:/o part and I also need to
tokenize words using the (:) as a separator so that I can then search for
"freebsd 1.1" or "openbsd 2.4" or just "freebsd".

Thanks in advance.

Joe



package org.apache.solr.handler.dataimport;

import java.util.zip.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.*;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.ut

Re: How do you parse the data in a field that is returned from a query?

2015-01-24 Thread Carl Roberts

The unzipped XML that I am reading looks like this:



http://scap.nist.gov/schema/scap-core/0.1"; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; 
xmlns:patch="http://scap.nist.gov/schema/patch/0.1"; 
xmlns:vuln="http://scap.nist.gov/schema/vulnerability/0.4"; 
xmlns:cvss="http://scap.nist.gov/schema/cvss-v2/0.2"; 
xmlns:cpe-lang="http://cpe.mitre.org/language/2.0"; 
xmlns="http://scap.nist.gov/schema/feed/vulnerability/2.0"; 
pub_date="2015-01-10T05:37:05" 
xsi:schemaLocation="http://scap.nist.gov/schema/patch/0.1 
http://nvd.nist.gov/schema/patch_0.1.xsd 
http://scap.nist.gov/schema/scap-core/0.1 
http://nvd.nist.gov/schema/scap-core_0.1.xsd 
http://scap.nist.gov/schema/feed/vulnerability/2.0 
http://nvd.nist.gov/schema/nvd-cve-feed_2.0.xsd"; nvd_xml_version="2.0">

  
http://nvd.nist.gov/";>
  























  


cpe:/o:freebsd:freebsd:2.2.8
cpe:/o:freebsd:freebsd:1.1.5.1
cpe:/o:freebsd:freebsd:2.2.3
cpe:/o:freebsd:freebsd:2.2.2
cpe:/o:freebsd:freebsd:2.2.5
cpe:/o:freebsd:freebsd:2.2.4
cpe:/o:freebsd:freebsd:2.0.5
cpe:/o:freebsd:freebsd:2.2.6
cpe:/o:freebsd:freebsd:2.1.6.1
cpe:/o:freebsd:freebsd:2.0.1
cpe:/o:freebsd:freebsd:2.2
cpe:/o:freebsd:freebsd:2.0
cpe:/o:openbsd:openbsd:2.3
cpe:/o:freebsd:freebsd:3.0
cpe:/o:freebsd:freebsd:1.1
cpe:/o:freebsd:freebsd:2.1.6
cpe:/o:openbsd:openbsd:2.4
cpe:/o:bsdi:bsd_os:3.1
cpe:/o:freebsd:freebsd:1.0
cpe:/o:freebsd:freebsd:2.1.7
cpe:/o:freebsd:freebsd:1.2
cpe:/o:freebsd:freebsd:2.1.5
cpe:/o:freebsd:freebsd:2.1.7.1

CVE-1999-0001
1999-12-30T00:00:00.000-05:00
2010-12-16T00:00:00.000-05:00

  
5.0
NETWORK
LOW
NONE
NONE
NONE
PARTIAL
http://nvd.nist.gov
2004-01-01T00:00:00.000-05:00
  



  OSVDB
  http://www.osvdb.org/5707"; 
xml:lang="en">5707



  CONFIRM
  href="http://www.openbsd.org/errata23.html#tcpfix"; 
xml:lang="en">http://www.openbsd.org/errata23.html#tcpfix


ip_input.c in BSD-derived TCP/IP implementations 
allows remote attackers to cause a denial of service (crash or hang) via 
crafted packets.

  

On 1/24/15, 3:49 PM, Carl Roberts wrote:
Via this rss-data-config.xml file and a class that I wrote (attached) 
to download an XML file from a ZIP URL:



readTimeout="3"/>


https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry">
commonField="false" />
commonField="false" />
commonField="false" />
xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name" commonField="false" 
/>
xpath="/nvd/entry/vulnerable-software-list/product" 
commonField="false" />
xpath="/nvd/entry/published-datetime" commonField="false" />
xpath="/nvd/entry/last-modified-datetime" commonField="false" />
commonField="false" />


http://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2003.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry">
commonField="false" />
commonField="false" />
commonField="false" />
xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name" commonField="false" 
/>
xpath="/nvd/entry/vulnerable-software-list/product" 
commonField="false" />
xpath="/nvd/entry/published-datetime" commonField="false" />
xpath="/nvd/entry/last-modified-datetime" commonField="false" />
commonField="false" />






On 1/24/15, 3:45 PM, Jack Krupansky wrote:

How are you currently importing data?

-- Jack Krupansky

On Sat, Jan 24, 2015 at 3:42 PM, Carl Roberts 

wrote:
Sorry if I was not clear.  What I am asking is this:

How can I parse the data during import to tokenize it by (:) and 
strip the

cpe:/o?



On 1/24/15, 3:28 PM, Alexandre Rafalovitch wrote:


You are using keywords here that seem to contradict with each other.
Or your use case is not clear.

Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
whatever you want with it!

Re: How do you parse the data in a field that is returned from a query?

2015-01-24 Thread Carl Roberts

Thanks Jack.

On 1/24/15, 3:57 PM, Jack Krupansky wrote:

Take a look at the RegexTransformer. Or,in some cases your may need to use
the raw ScriptTransformer.

See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

-- Jack Krupansky
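
Applied to this data-config, that would mean declaring the transformer on the 
entity and putting the regex on the field, roughly as follows (the column name 
and the exact regex are assumptions, not from the thread):

<entity ... transformer="RegexTransformer">
    <field column="product"
           xpath="/nvd/entry/vulnerable-software-list/product"
           regex="^cpe:/o:" replaceWith="" />
</entity>

The regex/replaceWith pair behaves like String.replaceAll on the extracted 
value, so the same mechanism can also turn the remaining colons into spaces 
(regex=":" replaceWith=" ").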

On Sat, Jan 24, 2015 at 3:49 PM, Carl Roberts 
wrote:
Via this rss-data-config.xml file and a class that I wrote (attached) to
download an XML file from a ZIP URL:


 
 
 https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
 processor="XPathEntityProcessor"
 forEach="/nvd/entry">
 
 
 
 
 
 
 
 
 
 http://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2003.xml.zip";
 processor="XPathEntityProcessor"
 forEach="/nvd/entry">
 
 
 
 
 
 
 
 
 
 
 



On 1/24/15, 3:45 PM, Jack Krupansky wrote:


How are you currently importing data?

-- Jack Krupansky

On Sat, Jan 24, 2015 at 3:42 PM, Carl Roberts <
carl.roberts.zap...@gmail.com


wrote:
Sorry if I was not clear.  What I am asking is this:

How can I parse the data during import to tokenize it by (:) and strip
the
cpe:/o?



On 1/24/15, 3:28 PM, Alexandre Rafalovitch wrote:

  You are using keywords here that seem to contradict with each other.

Or your use case is not clear.

Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
whatever you want with it!

But then at the end, you say you want to search for whatever you
stripped off. So, that should be back in Solr again?

Or are you asking something along these lines:
1. I have a multiValued field with the following sample content... (it
does not matter to Solr where it comes from)
2. I wanted it returned as is, but I want to be able to find documents
when somebody searches for X, Y, or Z
3. What would be the best analyzer chain to be able to do so?

Regards,
  Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 24 January 2015 at 15:04, Carl Roberts <
carl.roberts.zap...@gmail.com>
wrote:

  Hi,

How can I parse the data in a field that is returned from a query?

Basically,

I have a multi-valued field that contains values such as these that are
returned from a query:

 "cpe:/o:freebsd:freebsd:1.1.5.1",
 "cpe:/o:freebsd:freebsd:2.2.3",
 "cpe:/o:freebsd:freebsd:2.2.2",
 "cpe:/o:freebsd:freebsd:2.2.5",
 "cpe:/o:freebsd:freebsd:2.2.4",
 "cpe:/o:freebsd:freebsd:2.0.5",
 "cpe:/o:freebsd:freebsd:2.2.6",
 "cpe:/o:freebsd:freebsd:2.1.6.1",
 "cpe:/o:freebsd:freebsd:2.0.1",
 "cpe:/o:freebsd:freebsd:2.2",
 "cpe:/o:freebsd:freebsd:2.0",
 "cpe:/o:openbsd:openbsd:2.3",
 "cpe:/o:freebsd:freebsd:3.0",
 "cpe:/o:freebsd:freebsd:1.1",
 "cpe:/o:freebsd:freebsd:2.1.6",
 "cpe:/o:openbsd:openbsd:2.4",
 "cpe:/o:bsdi:bsd_os:3.1",
 "cpe:/o:freebsd:freebsd:1.0",
 "cpe:/o:freebsd:freebsd:2.1.7",
 "cpe:/o:freebsd:freebsd:1.2",
 "cpe:/o:freebsd:freebsd:2.1.5",
 "cpe:/o:freebsd:freebsd:2.1.7.1"],

And my problem is that I need to strip the cpe:/o part and I also need
to
tokenize words using the (:) as a separator so that I can then search
for
"freebsd 1.1" or "openbsd 2.4" or just "freebsd".

Thanks in advance.

Joe






What is the recommended way to import and update index records?

2015-01-27 Thread Carl Roberts

Hi,

What is the recommended way to import and update index records?

I've read the documentation and I've experimented with full-import and 
delta-import and I am not seeing the desired results.


Basically, I have 15 RSS feeds that I am importing through 
rss-data-config.xml.


The first RSS feed should be a full import and the ones that follow may 
contain the same id, in which case the existing id in the index should 
be updated from the record in the new RSS feed.  Also there may be new 
records in the RSS feeds that follow the first one, in which case I want 
them added to the index.


When I try full-import for each entity, the index is cleared and I just 
end up with the records for the last import.


When I try full-import for each entity, with the clean=false parameter, 
all the records from each entity are added to the index and I end up 
with duplicate records.


When I try delta-import for the entities that follow the first one, I 
don't get any new index records.


How should I do this?

Regards,

Joe


Re: What is the recommended way to import and update index records?

2015-01-27 Thread Carl Roberts
Also, if I try full-import and clean=false with the same XML file, I end 
up with more records each time the import runs.  How can I make SOLR 
just add the records that are new by id, and update the ones that have 
an id that matches the one in the existing index?



On 1/27/15, 11:32 AM, Carl Roberts wrote:

Hi,

What is the recommended way to import and update index records?

I've read the documentation and I've experimented with full-import and 
delta-import and I am not seeing the desired results.


Basically, I have 15 RSS feeds that I am importing through 
rss-data-config.xml.


The first RSS feed should be a full import and the ones that follow 
may contain the same id, in which case the existing id in the index 
should be updated from the record in the new RSS feed. Also there may 
be new records in the RSS feeds that follow the first one, in which 
case I want them added to the index.


When I try full-import for each entity, the index is cleared and I 
just end up with the records for the last import.


When I try full-import for each entity, with the clean=false 
parameter, all the records from each entity are added to the index and 
I end up with duplicate records.


When I try delta-import for the entities that follow the first one, I 
don't get any new index records.


How should I do this?

Regards,

Joe




Re: Is there a way to pass in proxy settings to Solr?

2015-01-27 Thread Carl Roberts

Hi Shawn,

I got it to work by using this script to start my instance of Solr:

java -Dhttp.proxyHost=http-proxy-server -Dhttp.proxyPort=80 
-Dhttps.proxyHost=http-proxy-server -Dhttps.proxyPort=80 
-Dlog4j.debug=true 
-Dlog4j.configuration=file:///Users/carlroberts/dev/solr-4.10.3/log4j.xml -Dsolr.solr.home=../ 
-classpath "./:lib/*:./log4j.xml" -jar start.jar



Regards,

Joe
On 1/22/15, 11:46 AM, Shawn Heisey wrote:

On 1/22/2015 9:18 AM, Carl Roberts wrote:

Is there a way to pass in proxy settings to Solr?

The reason that I am asking this question is that I am trying to run
the DIH RSS example, and it is not working when I try to import the
RSS feed URL because the code in Solr comes back with an unknown host
exception due to the proxy that we use at work.

If I use the curl tool and the environment variable http_proxy to
access the RSS feed directly it works, but it appears Solr does not
use that environment variable because it is throwing this error:

39642 [Thread-15] ERROR
org.apache.solr.handler.dataimport.URLDataSource – Exception thrown
while getting data

Checking the code, URLDataSource seems to use the URL capability that
comes with Java itself.  The system properties on this page are very
likely to affect objects that come with Java:

http://docs.oracle.com/javase/7/docs/api/java/net/doc-files/net-properties.html#Proxies

You would need to set these properties on the java commandline that
starts your servlet container, with the -D option.

Thanks,
Shawn





Re: What is the recommended way to import and update index records?

2015-01-27 Thread Carl Roberts
Hi Alex, thanks for clarifying this for me.  I'll take a look at my 
setup of the uniqueKey.  Perhaps I did not set it right.
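
For reference, the two pieces that give add-or-update-by-id behaviour are the 
uniqueKey in schema.xml and full-import with clean=false (a sketch using the 
id field from earlier in this thread):

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>

curl "http://localhost:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&commit=true"

With the uniqueKey in place, a re-imported document whose id already exists 
replaces the old one instead of being added as a duplicate.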



On 1/27/15, 12:09 PM, Alexandre Rafalovitch wrote:

What do you mean by "update"? If you mean partial update, DIH does not
do it AFAIK. If you mean replace, it should.

If you are getting duplicate records, maybe your uniqueKey is not set correctly?

clean=false looks to me like the right approach for incremental updates.

Regards,
Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 27 January 2015 at 11:43, Carl Roberts  wrote:

Also, if I try full-import and clean=false with the same XML file, I end up
with more records each time the import runs.  How can I make SOLR just add
the records that are new by id, and update the ones that have an id that
matches the one in the existing index?



On 1/27/15, 11:32 AM, Carl Roberts wrote:

Hi,

What is the recommended way to import and update index records?

I've read the documentation and I've experimented with full-import and
delta-import and I am not seeing the desired results.

Basically, I have 15 RSS feeds that I am importing through
rss-data-config.xml.

The first RSS feed should be a full import and the ones that follow may
contain the same id, in which case the existing id in the index should be
updated from the record in the new RSS feed. Also there may be new records
in the RSS feeds that follow the first one, in which case I want them added
to the index.

When I try full-import for each entity, the index is cleared and I just
end up with the records for the last import.

When I try full-import for each entity, with the clean=false parameter,
all the records from each entity are added to the index and I end up with
duplicate records.

When I try delta-import for the entities that follow the first one, I don't
get any new index records.

How should I do this?

Regards,

Joe






Cannot reindex to add a new field

2015-01-27 Thread Carl Roberts

Hi,

I have tried to reindex to add a new field named product-info and no 
matter what I do, I cannot get the new field to appear in the index 
after import via DIH.


Here is the rss-data-config.xml configuration (field product-info is the 
new field I added):



readTimeout="3"/>


https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry"
transformer="RegexTransformer">
commonField="false" />
commonField="false" />
commonField="false" />
*xpath="/nvd/entry/vulnerable-software-list/product" commonField="false"/>*
xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name" 
commonField="false"/>
xpath="/nvd/entry/vulnerable-software-list/product" commonField="false"/>
xpath="/nvd/entry/published-datetime" commonField="false" />
xpath="/nvd/entry/last-modified-datetime" commonField="false" />
commonField="false" />






Here is the section that contains the new product-info field in schema.xml:

 multiValued="true"/>

   
   
   
   
*stored="true" multiValued="true"/>*
   indexed="true" stored="true" multiValued="true"/>
   stored="true" multiValued="true"/>
   stored="true" />
   stored="true" />
   stored="true" />


Field product-info is defined in the same manner as vulnerable-software 
field and it pulls the same data as vulnerable-software list field via 
the same xpath, yet vulnerable-software field shows up in the results 
and product-info field does not.  Here is the response for a query after 
the import takes place - field product-info is missing:


~/dev/solr-4.10.3$ curl 
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=*:*&start=0&&rows=1";

{
  "responseHeader":{
"status":0,
"QTime":1,
"params":{
  "indent":"true",
  "start":"0",
  "q":"*:*",
  "wt":"json",
  "rows":"1"}},
  "response":{"numFound":6717,"start":0,"docs":[
  {
"id":"CVE-1999-0001",
"summary":"ip_input.c in BSD-derived TCP/IP implementations 
allows remote attackers to cause a denial of service (crash or hang) via 
crafted packets.",

"vulnerable-configuration":["cpe:/o:bsdi:bsd_os:3.1",
  "cpe:/o:freebsd:freebsd:1.0",
  "cpe:/o:freebsd:freebsd:1.1",
  "cpe:/o:freebsd:freebsd:1.1.5.1",
  "cpe:/o:freebsd:freebsd:1.2",
  "cpe:/o:freebsd:freebsd:2.0",
  "cpe:/o:freebsd:freebsd:2.0.5",
  "cpe:/o:freebsd:freebsd:2.1.5",
  "cpe:/o:freebsd:freebsd:2.1.6",
  "cpe:/o:freebsd:freebsd:2.1.6.1",
  "cpe:/o:freebsd:freebsd:2.1.7",
  "cpe:/o:freebsd:freebsd:2.1.7.1",
  "cpe:/o:freebsd:freebsd:2.2",
  "cpe:/o:freebsd:freebsd:2.2.3",
  "cpe:/o:freebsd:freebsd:2.2.4",
  "cpe:/o:freebsd:freebsd:2.2.5",
  "cpe:/o:freebsd:freebsd:2.2.6",
  "cpe:/o:freebsd:freebsd:2.2.8",
  "cpe:/o:freebsd:freebsd:3.0",
  "cpe:/o:openbsd:openbsd:2.3",
  "cpe:/o:openbsd:openbsd:2.4",
  "cpe:/o:freebsd:freebsd:2.2.2",
  "cpe:/o:freebsd:freebsd:2.0.1"],
"cve":"CVE-1999-0001",
"cwe":"CWE-20",
"published":"1999-12-30T00:00:00.000-05:00",
"vulnerable-software":["cpe:/o:freebsd:freebsd:2.2.8",
  "cpe:/o:freebsd:freebsd:1.1.5.1",
  "cpe:/o:freebsd:freebsd:2.2.3",
  "cpe:/o:freebsd:freebsd:2.2.2",
  "cpe:/o:freebsd:freebsd:2.2.5",
  "cpe:/o:freebsd:freebsd:2.2.4",
  "cpe:/o:freebsd:freebsd:2.0.5",
  "cpe:/o:freebsd:freebsd:2.2.6",
  "cpe:/o:freebsd:freebsd:2.1.6.1",
  "cpe:/o:freebsd:freebsd:2.0.1",
  "cpe:/o:freebsd:freebsd:2.2",
  "cpe:/o:freebsd:freebsd:2.0",
  "cpe:/o:openbsd:openbsd:2.3",
  "cpe:/o:freebsd:freebsd:3.0",
  "cpe:/o:freebsd:freebsd:1.1",
  "cpe:/o:freebsd:freebsd:2.1.6",
  "cpe:/o:openbsd:openbsd:2.4",
  "cpe:/o:bsdi:bsd_os:3.1",
  "cpe:/o:freebsd:freebsd:1.0",
  "cpe:/o:freebsd:freebsd:2.1.7",
  "cpe:/o:freebsd:freebsd:1.2",
  "cpe:/o:freebsd:freebsd:2.1.5",
  "cpe:/o:freebsd:freebsd:2.1.7.1"],
"modified":"2010-12-16T00:00:00.000-05:00",
"_version_":1491484873657942016}]
  }}

This is what I have tried so far:

Restarted Solr to reload the schema and reimported via full-import and 
clean=true.
Invoked the following command to reload the schema and reimported via 
full-import and clean=true


curl 
"http://localhost:8983/solr/admin/cores?action=RELOAD&core=nvd-rss";


I am not sure why this is not working as everything seems correct to me 
in my setup.  Could this be a bug?


Regards,

Joe
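[Note: a quick way to re-check a schema change like this, sketched with the core name and handlers used above - reload the core, re-run the import (synchronous=true is assumed from a later thread), and request the suspect field explicitly:

curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=nvd-rss"
curl "http://localhost:8983/solr/nvd-rss/dataimport?command=full-import&clean=true&synchronous=true"
curl "http://localhost:8983/solr/nvd-rss/select?q=*:*&rows=1&fl=id,product-info,vulnerable-software&wt=json&indent=true"

If product-info is still missing from that response, the field is not being populated at import time rather than being hidden by the query.]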




Re: After adding field to schema, the field is not being returned in results.

2015-01-27 Thread Carl Roberts

I too am running into what appears to be the same thing.

Everything works and data is imported but I cannot see the new field in 
the result.


Re: Cannot reindex to add a new field

2015-01-27 Thread Carl Roberts
Well - I got this to work.  I noticed that with log4j enabled, product-info 
showed up in the import as product-info=[], so I then played with 
the field and got this definition to work in the rss-data-config.xml file:


xpath="/nvd/entry/vulnerable-software-list/product" commonField="false" 
regex=":" replaceWith=" "/>


Don't ask me why the other one didn't work, as I think it should have 
worked also.


On 1/27/15, 3:42 PM, Carl Roberts wrote:

Hi,

I have tried to reindex to add a new field named product-info and no 
matter what I do, I cannot get the new field to appear in the index 
after import via DIH.


Here is the rss-data-config.xml configuration (field product-info is 
the new field I added):



readTimeout="3"/>



url="https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";

processor="XPathEntityProcessor"
forEach="/nvd/entry"
transformer="RegexTransformer">
commonField="false" />
commonField="false" />
commonField="false" />
*xpath="/nvd/entry/vulnerable-software-list/product" commonField="false"/>*
xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name" commonField="false"/>
xpath="/nvd/entry/vulnerable-software-list/product" commonField="false"/>
xpath="/nvd/entry/published-datetime" commonField="false" />
xpath="/nvd/entry/last-modified-datetime" commonField="false" />
commonField="false" />






Here is the section that contains the new product-info field in 
schema.xml:


 multiValued="true"/>

   
   
   
   
*stored="true" multiValued="true"/>*
   indexed="true" stored="true" multiValued="true"/>
   indexed="true" stored="true" multiValued="true"/>
   stored="true" />
   stored="true" />
   stored="true" />


Field product-info is defined in the same manner as 
vulnerable-software field and it pulls the same data as 
vulnerable-software list field via the same xpath, yet 
vulnerable-software field shows up in the results and product-info 
field does not.  Here is the response for a query after the import 
takes place - field product-info is missing:


~/dev/solr-4.10.3$ curl 
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=*:*&start=0&&rows=1";

{
  "responseHeader":{
"status":0,
"QTime":1,
"params":{
  "indent":"true",
  "start":"0",
  "q":"*:*",
  "wt":"json",
  "rows":"1"}},
  "response":{"numFound":6717,"start":0,"docs":[
  {
"id":"CVE-1999-0001",
"summary":"ip_input.c in BSD-derived TCP/IP implementations 
allows remote attackers to cause a denial of service (crash or hang) 
via crafted packets.",

"vulnerable-configuration":["cpe:/o:bsdi:bsd_os:3.1",
  "cpe:/o:freebsd:freebsd:1.0",
  "cpe:/o:freebsd:freebsd:1.1",
  "cpe:/o:freebsd:freebsd:1.1.5.1",
  "cpe:/o:freebsd:freebsd:1.2",
  "cpe:/o:freebsd:freebsd:2.0",
  "cpe:/o:freebsd:freebsd:2.0.5",
  "cpe:/o:freebsd:freebsd:2.1.5",
  "cpe:/o:freebsd:freebsd:2.1.6",
  "cpe:/o:freebsd:freebsd:2.1.6.1",
  "cpe:/o:freebsd:freebsd:2.1.7",
  "cpe:/o:freebsd:freebsd:2.1.7.1",
  "cpe:/o:freebsd:freebsd:2.2",
  "cpe:/o:freebsd:freebsd:2.2.3",
  "cpe:/o:freebsd:freebsd:2.2.4",
  "cpe:/o:freebsd:freebsd:2.2.5",
  "cpe:/o:freebsd:freebsd:2.2.6",
  "cpe:/o:freebsd:freebsd:2.2.8",
  "cpe:/o:freebsd:freebsd:3.0",
  "cpe:/o:openbsd:openbsd:2.3",
  "cpe:/o:openbsd:openbsd:2.4",
  "cpe:/o:freebsd:freebsd:2.2.2",
  "cpe:/o:freebsd:freebsd:2.0.1"],
"cve":"CVE-1999-0001",
"cwe":"CWE-20",
"published":"1999-12-30T00:00:00.000-05:00",
"vulnerable-software":["cpe:/o:freebsd:freebsd:2.2.8",
  "cpe:/o:freebsd:freebsd:1.1.5.1",
  "cpe:/o:freebsd:freebsd:2.2.3",
  

Re: Cannot reindex to add a new field

2015-01-27 Thread Carl Roberts
reebsd:freebsd:2.2.5",
  "cpe:/o:freebsd:freebsd:2.2.4",
  "cpe:/o:freebsd:freebsd:2.0.5",
  "cpe:/o:freebsd:freebsd:2.2.6",
  "cpe:/o:freebsd:freebsd:2.1.6.1",
  "cpe:/o:freebsd:freebsd:2.0.1",
  "cpe:/o:freebsd:freebsd:2.2",
  "cpe:/o:freebsd:freebsd:2.0",
  "cpe:/o:openbsd:openbsd:2.3",
  "cpe:/o:freebsd:freebsd:3.0",
  "cpe:/o:freebsd:freebsd:1.1",
  "cpe:/o:freebsd:freebsd:2.1.6",
  "cpe:/o:openbsd:openbsd:2.4",
  "cpe:/o:bsdi:bsd_os:3.1",
  "cpe:/o:freebsd:freebsd:1.0",
  "cpe:/o:freebsd:freebsd:2.1.7",
  "cpe:/o:freebsd:freebsd:1.2",
  "cpe:/o:freebsd:freebsd:2.1.5",
  "cpe:/o:freebsd:freebsd:2.1.7.1"],
"modified":"2010-12-16T00:00:00.000-05:00",
"_version_":1491493094540967936}]
  }}

On 1/27/15, 5:19 PM, Alexandre Rafalovitch wrote:

One xpath per field definition. You had two fields for the same xpath.
If they were the same value, the best bet would be to deal with it via
copyField in the schema.

No idea why the regex thing makes a difference; are you sure the other
field is also still being indexed?

Regards,
Alex.


Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 27 January 2015 at 17:11, Carl Roberts  wrote:

Well - I got this to work.  Noticed that when log4j is enabled product-info
was in the import as product-info=[], so I then played with the field and
got this definition to work in the rss-data-config.xml file:



Don't ask me why the other one didn't work, as I think it should have worked
also.


On 1/27/15, 3:42 PM, Carl Roberts wrote:

Hi,

I have tried to reindex to add a new field named product-info and no
matter what I do, I cannot get the new field to appear in the index after
import via DIH.

Here is the rss-data-config.xml configuration (field product-info is the
new field I added):


 
 
 https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
 processor="XPathEntityProcessor"
 forEach="/nvd/entry"
 transformer="RegexTransformer">
 
 
 
**
 
 
 
 
 
 
 



Here is the section that contains the new product-info field in
schema.xml:

  




**






Field product-info is defined in the same manner as vulnerable-software
field and it pulls the same data as vulnerable-software list field via the
same xpath, yet vulnerable-software field shows up in the results and
product-info field does not.  Here is the response for a query after the
import takes place - field product-info is missing:

~/dev/solr-4.10.3$ curl
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=*:*&start=0&&rows=1";
{
   "responseHeader":{
 "status":0,
 "QTime":1,
 "params":{
   "indent":"true",
   "start":"0",
   "q":"*:*",
   "wt":"json",
   "rows":"1"}},
   "response":{"numFound":6717,"start":0,"docs":[
   {
 "id":"CVE-1999-0001",
 "summary":"ip_input.c in BSD-derived TCP/IP implementations allows
remote attackers to cause a denial of service (crash or hang) via crafted
packets.",
 "vulnerable-configuration":["cpe:/o:bsdi:bsd_os:3.1",
   "cpe:/o:freebsd:freebsd:1.0",
   "cpe:/o:freebsd:freebsd:1.1",
   "cpe:/o:freebsd:freebsd:1.1.5.1",
   "cpe:/o:freebsd:freebsd:1.2",
   "cpe:/o:freebsd:freebsd:2.0",
   "cpe:/o:freebsd:freebsd:2.0.5",
   "cpe:/o:freebsd:freebsd:2.1.5",
   "cpe:/o:freebsd:freebsd:2.1.6",
   "cpe:/o:freebsd:freebsd:2.1.6.1",
   "cpe:/o:freebsd:freebsd:2.1.7",
   "cpe:/o:freebsd:freebsd:2.1.7.1",
   "cpe:/o:freebsd:freebsd:2.2",
   "cpe:/o:freebsd:freebsd:2.2.3",
   "cpe:/o:freebsd:freebsd:2.2.4",
   "cpe:/o:freebsd:freebsd:2.2.5",
   "cpe:/o:freebsd:freebsd:2.2.6",
   "cpe:/o:freebsd:freebsd:2.2.8",
   "cpe:/o:freebsd:freebsd:3.0",
   "cpe:/o:openbsd:openbsd:2.3",
   "cpe:/o:openbsd:openbsd:2.4",
   "cpe:/o:freebsd:freebsd:2.2.2",
   "cpe:/o:freebsd:freebsd:2.0.1"],
 "cve":"CVE-1999-0001",
 "cwe":"CWE-20",
 "published":"1999-12-30T00:00:00.000-05:00",
 "vulnerable-software":["cpe:/o:freebsd:freebsd:2.2.8",
   "cpe:/o:freebsd:freebsd:1.1.5.1",
   "cpe:/o:freebsd:freebsd:2.2.3",
   "cpe:/o:freebsd:freebsd:2.2.2",
   "cpe:/o:freebsd:freebsd:2.2.5",
   "cpe:/o:freebsd:freebsd:2.2.4",
   "cpe:/o:freebsd:freebsd:2.0.5",
   "cpe:/o:freebsd:freebsd:2.2.6",
   "cpe:/o:freebsd:freebsd:2.1.6.1",
   "cpe:/o:freebsd:freebsd:2.0.1",
   "cpe:/o:freebsd:freebsd:2.2",
   "cpe:/o:freebsd:freebsd:2.0",
   "cpe:/o:openbsd:openbsd:2.3",
   "cpe:/o:freebsd:freebsd:3.0",
   "cpe:/o:freebsd:freebsd:1.1",
   "cpe:/o:freebsd:freebsd:2.1.6",
   "cpe:/o:openbsd:openbsd:2.4",
   "cpe:/o:bsdi:bsd_os:3.1",
   "cpe:/o:freebsd:freebsd:1.0",
   "cpe:/o:freebsd:freebsd:2.1.7",
   "cpe:/o:freebsd:freebsd:1.2",
   "cpe:/o:freebsd:freebsd:2.1.5",
   "cpe:/o:freebsd:freebsd:2.1.7.1"],
 "modified":"2010-12-16T00:00:00.000-05:00",
 "_version_":1491484873657942016}]
   }}

This is what I have tried so far:

Restarted Solr to reload the schema and reimported via full-import and
clean=true.
Invoked the following command to reload the schema and reimported via
full-import and clean=true

 curl
"http://localhost:8983/solr/admin/cores?action=RELOAD&core=nvd-rss";

I am not sure why this is not working as everything seems correct to me in
my setup.  Could this be a bug?

Regards,

Joe






Re: What is the recommended way to import and update index records?

2015-01-27 Thread Carl Roberts
OK - I did a little testing and with full-import and clean=false, I get 
more and more records when I import the same XML file. I have also 
checked and I see that my uniqueKey is defined correctly.


Here are my fields in schema.xml:

multiValued="true"/>

   
   
   
   
   indexed="true" stored="true" multiValued="true"/>
   stored="true" multiValued="true"/>
   stored="true" multiValued="true"/>
   stored="true" />
   stored="true" />
   stored="true" />
   stored="true" />
   stored="true" />
   indexed="true" stored="true" />
   stored="true" />
   indexed="true" stored="true" />
   indexed="true" stored="true" />
   indexed="true" stored="true" />
   stored="true" multiValued="true"/>
   stored="true" />


And here is uniqueKey in schema.xml:

id


Here is my rss-data-config.xml:


readTimeout="3"/>


https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry"
transformer="RegexTransformer">
commonField="false" />
commonField="false" />
commonField="false" />
xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name" 
commonField="false"/>
xpath="/nvd/entry/vulnerable-software-list/product" commonField="false"/>
commonField="false" regex="cpe:/.:" replaceWith=""/>
replaceWith=" "/>
xpath="/nvd/entry/published-datetime" commonField="false" />
xpath="/nvd/entry/last-modified-datetime" commonField="false" />
commonField="false" />
xpath="/nvd/entry/cvss/base_metrics/score" commonField="false" />
xpath="/nvd/entry/cvss/base_metrics/access-vector" commonField="false" />
xpath="/nvd/entry/cvss/base_metrics/access-complexity" 
commonField="false" />
xpath="/nvd/entry/cvss/base_metrics/authentication" commonField="false" />
xpath="/nvd/entry/cvss/base_metrics/confidentiality-impact" 
commonField="false" />
xpath="/nvd/entry/cvss/base_metrics/integrity-impact" commonField="false" />
xpath="/nvd/entry/cvss/base_metrics/availability-impact" 
commonField="false" />
xpath="/nvd/entry/references/reference/@href" commonField="false" />
xpath="/nvd/entry/security-protection" commonField="false" />





Here is the import command the first time:

*curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&entity=cve-2002&clean=true"*


Here is the command that outputs the count of records:

*curl 
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=*:*&start=0&&rows=0&fl=*"*


And here is the output:

{
  "responseHeader":{
"status":0,
"QTime":0,
"params":{
  "fl":"*",
  "indent":"true",
  "start":"0",
  "q":"*:*",
  "wt":"json",
  "rows":"0"}},
  "response":{"numFound":6717,"start":0,"docs":[]
  }}

Now here is the next full-import command with clean=false:

*"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&entity=cve-2002&clean=false"*

And here is the new count:

*curl 
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=*:*&start=0&&rows=0&fl=*"*


{
  "responseHeader":{
"status":0,
"QTime":0,
"params":{
  "fl":"*",
  "indent":"true",
  "start":"0",
  "q":"*:*",
  "wt":"json",
  "rows":"0"}},
  "response":{"numFound":13434,"start":0,"docs":[]
  }}

Clearly, this is just importing the same records twice.


What is even more puzzling is that if I search for an id value that is 
unique in the imported XML, I get all records back:


curl 
"http://localhost:8983/solr/nvd-rss/select?wt

Re: What is the recommended way to import and update index records?

2015-01-27 Thread Carl Roberts

Yep - it works with string.  Thanks a lot!



On 1/27/15, 7:08 PM, Alexandre Rafalovitch wrote:



Make that id field a string and reindex. text_general is not the right
type for a unique key.

Regards,
   Alex.
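[Note: a sketch of a quick check that the string uniqueKey now deduplicates, reusing the commands from this thread - import the same entity twice with clean=false and confirm numFound stays at 6717 instead of doubling to 13434; synchronous=true is assumed so each command waits for the import to finish:

curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&entity=cve-2002&clean=false&synchronous=true"
curl "http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=*:*&rows=0"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&entity=cve-2002&clean=false&synchronous=true"
curl "http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=*:*&rows=0"
]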




Running multiple full-import commands via curl in a script

2015-01-27 Thread Carl Roberts

Hi,

I am attempting to run all these curl commands from a script so that I 
can put them in a crontab job; however, it seems that only the first one 
executes and the other ones return with an error (below):


curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2002";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2003";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2004";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2005";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2006";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2007";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2008";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2009";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2010";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2011";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2012";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2013";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2014";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2015";
curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=delta-import&clean=false&entity=cve-last";


error:

*A command is still running...*

Question:  Is there a way to queue the other requests in Solr so that 
they run as soon as the previous one is done?  If not, how would you 
recommend I do this?


Many thanks in advance,

Joe




Re: Running multiple full-import commands via curl in a script

2015-01-28 Thread Carl Roberts

Thanks Mikhail - synchronous=true works like a charm...:)
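[Note: for reference, the resulting script is roughly the following - the same URLs as in the quoted message below, with &synchronous=true appended so each import blocks until the previous one finishes; it is written as a loop here purely for brevity:

#!/bin/sh
for year in 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015; do
  curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&synchronous=true&entity=cve-$year"
done
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=delta-import&clean=false&synchronous=true&entity=cve-last"
]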

On 1/28/15, 5:16 AM, Mikhail Khludnev wrote:

Literally, a queue can be done by submitting the request as-is (async) and polling
the command status. However, given
https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataImportHandler.java#L200
you can try to add &synchronous=true&... to the request; that should hang the request until
it's completed.
The other question is how to run requests in parallel, which is explicitly
prevented by
https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataImportHandler.java#L173
  The only workaround I can suggest is to duplicate the DIH definitions in the solr
config
...
...
...
  ...
then those guys should be able to handle their own requests in parallel. Nasty
stuff..
have a good hack
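[Note: the async-plus-status-polling alternative mentioned above might look roughly like this; it assumes the /dataimport status response contains the word "idle" once no import is running:

#!/bin/sh
wait_for_dih() {
  # poll the DIH status handler until the current import has finished
  until curl -s "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=status" | grep -q idle; do
    sleep 5
  done
}
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2002"
wait_for_dih
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&entity=cve-2003"
wait_for_dih
]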

On Wed, Jan 28, 2015 at 3:47 AM, Carl Roberts 
wrote:
Hi,

I am attempting to run all these curl commands from a script so that I can
put them in a crontab job, however, it seems that only the first one
executes and the other ones return with an error (below):

curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2002"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2003"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2004"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2005"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2006"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2007"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2008"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2009"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2010"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2011"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2012"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2013"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2014"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
full-import&clean=false&entity=cve-2015"
curl "http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=
delta-import&clean=false&entity=cve-last"

error:

*A command is still running...*

Question:  Is there a way to queue the other requests in Solr so that they
run as soon as the previous one is done?  If not, how would you recommend I
do this?

Many thanks in advance,

Joe









What is the best way to update an index?

2015-01-28 Thread Carl Roberts

Hi,

What is the best way to update an index with new data or records? Via 
this command:


curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&synchronous=true&entity=cve-2002";


or this command:

curl 
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=delta-import&synchronous=true&entity=cve-2002";



Thanks,

Joe


How do I unsubscribe?

2015-06-08 Thread Carl Roberts

How do I unsubscribe?


Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts

Hi,

I have downloaded the code and documentation for Solr version 4.10.3.

I am trying to follow the SolrJ Wiki guide and I am running into errors.  
The latest error is this one:


Exception in thread "main" org.apache.solr.common.SolrException: No such 
core: db
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)

at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:39)

My code is this:

package solr;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.SolrCore;


public class Test {
public static void main(String [] args){
CoreContainer container = new 
CoreContainer("/Users/carlroberts/dev/solr-4.10.3");

System.out.println(container.getDefaultCoreName());
System.out.println(container.getSolrHome());
container.load();
System.out.println(container.isLoaded("db"));
System.out.println(container.getCoreInitFailures());
Collection<SolrCore> cores = container.getCores();
System.out.println(cores);
EmbeddedSolrServer server = new EmbeddedSolrServer( container, 
"db" );

SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "id", "id1", 1.0f );
doc1.addField( "name", "doc1", 1.0f );
doc1.addField( "price", 10 );
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField( "id", "id2", 1.0f );
doc2.addField( "name", "doc2", 1.0f );
doc2.addField( "price", 20 );
Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
docs.add( doc1 );
docs.add( doc2 );
try{
server.add( docs );
server.commit();
server.deleteByQuery( "*:*" );
}catch(IOException e){
e.printStackTrace();
}catch(SolrServerException e){
e.printStackTrace();
}
}
}


My solr.xml file is this:







  

  


And my db/conf directory was copied from example/solr/collection/conf 
directory and it contains the solrconfig.xml file and schema.xml file.


I have noticed that the documentation that shows how to use the 
EmbeddedSolrServer is outdated as it indicates I should use 
CoreContainer.Initializer class which doesn't exist, and 
container.load(path, file) which also doesn't exist.


At this point I have no idea why I am getting the No such core error. I 
have googled it and there seem to be tons of threads showing this 
error for different reasons, and I have tried all the suggested 
resolutions and get nowhere with this.


Can you please help?

Regards,

Joe


Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
So far I have not been able to get the logging to work - here is what I 
get in the console prior to the exception:


SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for 
further details.

db
/Users/carlroberts/dev/solr-4.10.3/
false
{}
[]
/Users/carlroberts/dev/solr-4.10.3/


On 1/21/15, 11:50 AM, Alan Woodward wrote:

That certainly looks like it ought to work.  Is there log output that you could 
show us as well?

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 16:09, Carl Roberts wrote:


Hi,

I have downloaded the code and documentation for Solr version 4.10.3.

I am trying to follow SolrJ Wiki guide and I am running into errors.  The 
latest error is this one:

Exception in thread "main" org.apache.solr.common.SolrException: No such core: 
db
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:39)

My code is this:

package solr;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.SolrCore;


public class Test {
public static void main(String [] args){
CoreContainer container = new 
CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
System.out.println(container.getDefaultCoreName());
System.out.println(container.getSolrHome());
container.load();
System.out.println(container.isLoaded("db"));
System.out.println(container.getCoreInitFailures());
Collection<SolrCore> cores = container.getCores();
System.out.println(cores);
EmbeddedSolrServer server = new EmbeddedSolrServer( container, "db" );
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "id", "id1", 1.0f );
doc1.addField( "name", "doc1", 1.0f );
doc1.addField( "price", 10 );
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField( "id", "id2", 1.0f );
doc2.addField( "name", "doc2", 1.0f );
doc2.addField( "price", 20 );
Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
docs.add( doc1 );
docs.add( doc2 );
try{
server.add( docs );
server.commit();
server.deleteByQuery( "*:*" );
}catch(IOException e){
e.printStackTrace();
}catch(SolrServerException e){
e.printStackTrace();
}
}
}


My solr.xml file is this:







  

  


And my db/conf directory was copied from example/solr/collection/conf directory 
and it contains the solrconfig.xml file and schema.xml file.

I have noticed that the documentation that shows how to use the 
EmbeddedSolrServer is outdated as it indicates I should use 
CoreContainer.Initializer class which doesn't exist, and container.load(path, 
file) which also doesn't exist.

At this point I have no idea why I am getting the No such core error and I have 
googled it and there seems to be tons of threads showing this error but for 
different reasons, and I have tried all the suggested resolutions and get 
nowhere with this.

Can you please help?

Regards,

Joe






Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts

Hi,

Could there be a bug in the EmbeddedSolrServer that is causing this?

Is it still supported in version 4.10.3?

If it is, can someone please provide me assistance with this?

Regards,

Joe

On 1/21/15, 12:18 PM, Carl Roberts wrote:

I had to hardcode the path in solrconfig.xml from this:

${solr.install.dir:}

to this:

 /Users/carlroberts/dev/solr-4.10.3/


to avoid the classloader warnings, but I still get the same error. I 
am not sure where the ${solr.install.dir:} value gets pulled from but 
apparently that is not working.  Here is the new output:


[main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' 
to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' 
to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 
1023143764
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]

db
/Users/carlroberts/dev/solr-4.10.3/
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
org.slf4j.impl.SimpleLoggerFactory

[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for 
directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' 
to classloader
[coreLoadExecutor-5-t

Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
ema.IndexSchema - 
Reading Solr Schema from 
/Users/carlroberts/dev/solr-4.10.3/db/conf/schema.xml
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.schema.IndexSchema - 
[db] Schema name=example

false
{}
[]
/Users/carlroberts/dev/solr-4.10.3/
Exception in thread "main" org.apache.solr.common.SolrException: No such 
core: db
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)

at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:40)

On 1/21/15, 11:50 AM, Alan Woodward wrote:

That certainly looks like it ought to work.  Is there log output that you could 
show us as well?

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 16:09, Carl Roberts wrote:


Hi,

I have downloaded the code and documentation for Solr version 4.10.3.

I am trying to follow SolrJ Wiki guide and I am running into errors.  The 
latest error is this one:

Exception in thread "main" org.apache.solr.common.SolrException: No such core: 
db
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:39)

My code is this:

package solr;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.SolrCore;


public class Test {
public static void main(String [] args){
CoreContainer container = new 
CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
System.out.println(container.getDefaultCoreName());
System.out.println(container.getSolrHome());
container.load();
System.out.println(container.isLoaded("db"));
System.out.println(container.getCoreInitFailures());
Collection<SolrCore> cores = container.getCores();
System.out.println(cores);
EmbeddedSolrServer server = new EmbeddedSolrServer( container, "db" );
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "id", "id1", 1.0f );
doc1.addField( "name", "doc1", 1.0f );
doc1.addField( "price", 10 );
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField( "id", "id2", 1.0f );
doc2.addField( "name", "doc2", 1.0f );
doc2.addField( "price", 20 );
Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
docs.add( doc1 );
docs.add( doc2 );
try{
server.add( docs );
server.commit();
server.deleteByQuery( "*:*" );
}catch(IOException e){
e.printStackTrace();
}catch(SolrServerException e){
e.printStackTrace();
}
}
}


My solr.xml file is this:







  

  


And my db/conf directory was copied from example/solr/collection/conf directory 
and it contains the solrconfig.xml file and schema.xml file.

I have noticed that the documentation that shows how to use the 
EmbeddedSolrServer is outdated as it indicates I should use 
CoreContainer.Initializer class which doesn't exist, and container.load(path, 
file) which also doesn't exist.

At this point I have no idea why I am getting the No such core error and I have 
googled it and there seems to be tons of threads showing this error but for 
different reasons, and I have tried all the suggested resolutions and get 
nowhere with this.

Can you please help?

Regards,

Joe






Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
b/hppc-0.5.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/jackson-core-asl-1.9.13.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/jackson-mapper-asl-1.9.13.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/mahout-collections-1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/mahout-math-0.6.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/simple-xml-2.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/dist/solr-clustering-4.10.3.jar' to 
classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/langid/lib/jsonic-1.2.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/langid/lib/langdetect-1.1-20120112.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/dist/solr-langid-4.10.3.jar' to 
classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/commons-beanutils-1.8.3.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/commons-collections-3.2.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/velocity-1.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/velocity-tools-2.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/dist/solr-velocity-4.10.3.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.update.SolrIndexConfig - IndexWriter infoStream solr 
logging is enabled
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Using Lucene MatchVersion: 4.10.3
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.Config - Loaded 
SolrConfig: solrconfig.xml
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.schema.IndexSchema - 
Reading Solr Schema from 
/Users/carlroberts/dev/solr-4.10.3/db/conf/schema.xml
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.schema.IndexSchema - 
[db] Schema name=example

false
{}
[]
/Users/carlroberts/dev/solr-4.10.3/
Exception in thread "main" org.apache.solr.common.SolrException: No such 
core: db
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)

at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:40)
On 1/21/15, 12:01 PM, Carl Roberts wrote:
OK - I figured out the logging.  Here is the logging output plus the 
console output and the stack trace:


main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' 
to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' 
to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 
2050551931

db
/Users/carlroberts/dev/solr-4.10.3/[main] INFO 
org.apache.solr.core.CoreContainer -

Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
g.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/jackson-mapper-asl-1.9.13.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/mahout-collections-1.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/mahout-math-0.6.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/clustering/lib/simple-xml-2.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/dist/solr-clustering-4.10.3.jar' to 
classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/langid/lib/jsonic-1.2.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/langid/lib/langdetect-1.1-20120112.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/dist/solr-langid-4.10.3.jar' to 
classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/commons-beanutils-1.8.3.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/commons-collections-3.2.1.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/velocity-1.7.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/velocity/lib/velocity-tools-2.0.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/dist/solr-velocity-4.10.3.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.update.SolrIndexConfig - IndexWriter infoStream solr 
logging is enabled
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Using Lucene MatchVersion: 4.10.3
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.Config - Loaded 
SolrConfig: solrconfig.xml
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.schema.IndexSchema - 
Reading Solr Schema from 
/Users/carlroberts/dev/solr-4.10.3/db/conf/schema.xml
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.schema.IndexSchema - 
[db] Schema name=example

default core name=db
solr home=/Users/carlroberts/dev/solr-4.10.3/
db is loaded=false
core init failures={}
cores=[]
Exception in thread "main" org.apache.solr.common.SolrException: No such 
core: db
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)

at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:38)

On 1/21/15, 12:31 PM, Alan Woodward wrote:

Ah, OK, you need to include a logging jar in your classpath - the log4j and 
slf4j-log4j jars in the solr distribution will help here.  Once you've got some 
logging set up, then you should be able to work out what's going wrong!

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 16:53, Carl Roberts wrote:


So far I have not been able to get the logging to work - here is what I get in 
the console prior to the exception:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
db
/Users/carlroberts/dev/solr-4.10.3/
false
{}
[]
/Users/carlroberts/dev/solr-4.10.3/


On 1/21/15, 11:50 AM, Alan Woodward wrote:

That certainly looks like it ought to work.  Is there log output that you could 
show us as well?

Alan Woodward
www.flax.co.uk


On 21 Jan 2015, at 16:09, Carl Roberts wrote:


Hi,

I have downloaded the code and documentation for Solr version 4.10.3.

I am trying to follow SolrJ Wiki guide and I am running into errors.  The 
latest error is this one:

Exception in thread "main" org.apache.solr.common.SolrException: No such cor

Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-21 Thread Carl Roberts

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several elements 
in each node that I have to index, so I was planning to parse the XML 
with Stax and extract the data from each node and add it to Solr.  There 
will always be only one one file to start with and then a second file as 
the RSS feeds supplies updates.  I want to return certain fields of each 
node when I search certain fields of the same node.  Is Solr overkill in 
this case?  Should I just use Lucene instead?


Regards,

Joe


Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
Ah - OK - let me try that.   BTW - I applied the fix from the bug link 
you gave me to log the errors and I am now at least getting the actual 
errors:


*default core name=db
solr home=/Users/carlroberts/dev/solr-4.10.3/
db is loaded=false
core init 
failures={db=org.apache.solr.core.CoreContainer$CoreLoadFailure@4d351f9b}

cores=[]
Exception in thread "main" org.apache.solr.common.SolrException: 
SolrCore 'db' is not available due to init failure: JVM Error creating 
core [db]: org/apache/lucene/queries/function/ValueSource

at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:749)
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:110)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)

at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at solr.Test.main(Test.java:38)
Caused by: org.apache.solr.common.SolrException: JVM Error creating core 
[db]: org/apache/lucene/queries/function/ValueSource

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:508)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: 
org/apache/lucene/queries/function/ValueSource

at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:484)
at 
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:521)
at 
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:517)
at 
org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:81)
at 
org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
at 
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)

at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:486)
at org.apache.solr.schema.IndexSchema.(IndexSchema.java:166)
at 
org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
at 
org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
at 
org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:90)
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:489)
... 6 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.lucene.queries.function.ValueSource

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 21 more
*
On 1/21/15, 7:32 PM, Shawn Heisey wrote:

On 1/21/2015 5:16 PM, Carl Roberts wrote:

BTW - it seems that it is very hard to get started with the Embedded
server.  The doc is out of date.  The code seems to be untested and buggy.

On 1/21/15, 7:15 PM, Carl Roberts wrote:

Hmmm... It looks like FutureTask is calling setException(Throwable t)
with this exception, which is not making it to the console.

What I don't understand is why it is throwing that exception.  I made
sure that I added lucene-queries-4.10.3.jar file to the classpath by
adding it to the solr home directory.  See the new tracing:

I'm pretty sure that all the lucene jars need to be available *before*
Solr reaches the point in the log that you have quoted, where it adds
jars from ${solr.solr.home}/lib.  This would be the same location where
the solrj and solr-core jars live.  The only kind of jars that should be
in the solr home lib directory are extra jars for extra features that
you might specify in schema.xml (or some places in solrconfig.xml), like
the ICU analysis jars, tika, mysql, etc.

Thanks,
Shawn
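[Note: in other words, roughly the following - the lib/ directory here is assumed to be wherever the solr-core, solr-solrj, lucene-*.jar files and their dependencies have been collected by hand; only optional add-on jars (Tika, ICU, JDBC drivers, etc.) belong in ${solr.solr.home}/lib:

java -cp "lib/*:." solr.Test    # "." is assumed to hold the compiled solr.Test class
]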





Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts
ngInfoStream - [IW][main]: now 
flush at close
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: start 
flush: applyAllDeletes=true
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: index 
before flush
[main] INFO org.apache.solr.update.LoggingInfoStream - [DW][main]: 
startFullFlush
[main] INFO org.apache.solr.update.LoggingInfoStream - [DW][main]: main 
finishFullFlush success=true
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: apply 
all deletes during flush
[main] INFO org.apache.solr.update.LoggingInfoStream - [BD][main]: prune 
sis=segments_3:  minGen=9223372036854775807 packetCount=0
[main] INFO org.apache.solr.update.LoggingInfoStream - [CMS][main]: now 
merge
[main] INFO org.apache.solr.update.LoggingInfoStream - [CMS][main]:   
index:
[main] INFO org.apache.solr.update.LoggingInfoStream - [CMS][main]:   no 
more merges pending; now return
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
waitForMerges
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
waitForMerges done
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
commit: start
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
commit: enter lock
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
commit: now prepare
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
prepareCommit: flush
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: index 
before flush
[main] INFO org.apache.solr.update.LoggingInfoStream - [DW][main]: 
startFullFlush
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: apply 
all deletes during flush
[main] INFO org.apache.solr.update.LoggingInfoStream - [BD][main]: prune 
sis=segments_3:  minGen=9223372036854775807 packetCount=0
[main] INFO org.apache.solr.update.LoggingInfoStream - [DW][main]: main 
finishFullFlush success=true
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
startCommit(): start
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
startCommit index= changeCount=4
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: done 
all syncs: []
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
commit: pendingCommit != null
[main] INFO org.apache.solr.update.LoggingInfoStream - [IFD][main]: now 
checkpoint "" [0 segments ; isCommit = true]
[main] INFO org.apache.solr.core.SolrCore - SolrDeletionPolicy.onCommit: 
commits: num=2
commit{dir=NRTCachingDirectory(MMapDirectory@/Users/carlroberts/dev/solr-4.10.3/db/data/index 
lockFactory=NativeFSLockFactory@/Users/carlroberts/dev/solr-4.10.3/db/data/index; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_4,generation=4}
commit{dir=NRTCachingDirectory(MMapDirectory@/Users/carlroberts/dev/solr-4.10.3/db/data/index 
lockFactory=NativeFSLockFactory@/Users/carlroberts/dev/solr-4.10.3/db/data/index; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_5,generation=5}

[main] INFO org.apache.solr.core.SolrCore - newest commit generation = 5
[main] INFO org.apache.solr.update.LoggingInfoStream - [IFD][main]: 
deleteCommits: now decRef commit "segments_4"
[main] INFO org.apache.solr.update.LoggingInfoStream - [IFD][main]: 
delete "segments_4"
[main] INFO org.apache.solr.update.LoggingInfoStream - [IFD][main]: 0 
msec to checkpoint
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
commit: wrote segments file "segments_5"
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
commit: took 5.4 msec
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
commit: done

[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: rollback
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: all 
running merges have aborted
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
rollback: done finish merges

[main] INFO org.apache.solr.update.LoggingInfoStream - [DW][main]: abort
[main] INFO org.apache.solr.update.LoggingInfoStream - [DW][main]: done 
abort; abortedFiles=[] success=true
[main] INFO org.apache.solr.update.LoggingInfoStream - [IW][main]: 
rollback: infos=
[main] INFO org.apache.solr.update.LoggingInfoStream - [IFD][main]: now 
checkpoint "" [0 segments ; isCommit = false]
[main] INFO org.apache.solr.update.LoggingInfoStream - [IFD][main]: 0 
msec to checkpoint
[main] INFO org.apache.solr.core.SolrCore - [db] Closing main searcher 
on request.
[main] INFO org.apache.solr.core.CachingDirectoryFactory - Closing 
NRTCachingDirectoryFactory - 2 directories currently being tracked
[main] INFO org.apache.solr.core.CachingDirectoryFactory - looking to 
close /Users/carlroberts/dev/solr-4.10.3/db/data/index 
[CachedDir<>]
[main] INFO org.apache.solr.core.CachingDirectoryFactory - Closing 
directory: /Users/carlroberts/dev/solr-4.10.3/db/data/index
[main] INFO org.apache.solr.core.CachingDirectory

Re: Errors using the Embedded Solr Server

2015-01-21 Thread Carl Roberts

Got it all working...:)

I just replaced the solrconfig.xml and schema.xml files that I was using 
with the ones from collection1 in one of the examples.  I had modified 
those files to remove certain sections which I thought were not needed 
and apparently I don't understand those files very well yet...:)
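[Note: roughly what that amounts to, with the paths used earlier in this thread; the source path assumes the stock collection1 config shipped with the 4.10.3 examples:

cp example/solr/collection1/conf/solrconfig.xml \
   example/solr/collection1/conf/schema.xml \
   /Users/carlroberts/dev/solr-4.10.3/db/conf/
]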


Many thanks,

Joe

On 1/21/15, 8:47 PM, Carl Roberts wrote:

Hi Shawn,

Many thanks for all your help.  Moving the Lucene JARs from 
solr.solr.home/lib to the same classpath directory as the Solr JARs, 
plus adding a bunch more dependency JAR files and most of the files 
from the collection1/conf directory (these ones, to be exact), has me a 
lot closer to my goal:


-rw-r--r--   1 carlroberts  staff 38 Jan 21 20:41 _rest_managed.json
-rw-r--r--   1 carlroberts  staff 56 Jan 21 20:41 
_schema_analysis_stopwords_english.json

-rw-r--r--   1 carlroberts  staff   4041 Dec 10 00:37 currency.xml
-rw-r--r--   1 carlroberts  staff   1386 Dec 10 00:37 elevate.xml
drwxr-xr-x  41 carlroberts  staff   1394 Dec 10 00:37 lang
-rw-r--r--   1 carlroberts  staff    894 Dec 10 00:37 protwords.txt
-rw-r--r--@  1 carlroberts  staff  62063 Jan 21 13:02 schema.xml
-rw-r--r--@  1 carlroberts  staff  76821 Jan 21 13:03 solrconfig.xml
-rw-r--r--   1 carlroberts  staff 16 Dec 10 00:37 spellings.txt
-rw-r--r--   1 carlroberts  staff    795 Dec 10 00:37 stopwords.txt
-rw-r--r--   1 carlroberts  staff   1148 Dec 10 00:37 synonyms.txt


I am now getting this:

[main] INFO org.apache.solr.core.SolrResourceLoader - new 
SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' 
to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to 
classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' 
to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container 
configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 
139145087
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into 
CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory 
- Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is 
org.slf4j.impl.SimpleLoggerFactory

[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for 
directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - 
Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' 
to classloader
[coreLoadExecutor-5-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader - Adding 
'file:/Users/carlroberts/dev/
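
For reference - the thread itself only lists the JARs and conf files - here is 
a minimal sketch of the 4.10-style embedded startup that produces log output 
like the above. The solr home path and the core name "db" come from this 
archive; the document field is made up and depends on schema.xml.

import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class EmbeddedStart {
    public static void main(String[] args) throws Exception {
        // Solr home must contain solr.xml plus the core directory (here "db")
        // with its conf/solrconfig.xml and conf/schema.xml, as described above.
        String solrHome = "/Users/carlroberts/dev/solr-4.10.3";
        CoreContainer container = new CoreContainer(solrHome);
        container.load();                                  // discovers and loads the cores
        EmbeddedSolrServer server = new EmbeddedSolrServer(container, "db");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "test-1");                      // field names depend on schema.xml
        server.add(doc);
        server.commit();

        server.shutdown();                                 // also shuts down the CoreContainer
    }
}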

Is there a way to pass in proxy settings to Solr?

2015-01-22 Thread Carl Roberts

Hi,

Is there a way to pass in proxy settings to Solr?

The reason that I am asking this question is that I am trying to run the 
DIH RSS example, and it is not working when I try to import the RSS feed 
URL because the code in Solr comes back with an unknown host exception 
due to the proxy that we use at work.


If I use the curl tool and the environment variable http_proxy to access 
the RSS feed directly it works, but it appears Solr does not use that 
environment variable because it is throwing this error:


39642 [Thread-15] ERROR org.apache.solr.handler.dataimport.URLDataSource 
– Exception thrown while getting data

java.net.UnknownHostException: rss.slashdot.org
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)

at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at 
sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:996)
at 
sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932)
at 
sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:850)
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300)
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:98)
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:42)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:283)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:224)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:204)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)


Thanks in advance,

Joe
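
No answer to the proxy question appears in this archive. The stack trace 
above does go through java.net HttpURLConnection, which honors the standard 
JVM proxy system properties, so one likely approach (an assumption, not 
something confirmed in the thread) is to set those properties - either as 
-Dhttp.proxyHost=... / -Dhttp.proxyPort=... flags on the java command that 
starts Jetty, or programmatically when running Solr embedded. A sketch with 
placeholder proxy values:

import java.io.InputStream;
import java.net.URL;

public class ProxyCheck {
    public static void main(String[] args) throws Exception {
        // Standard JVM proxy properties; the same values can be passed to the
        // java command that starts Jetty/Solr as -D flags.
        // proxy.example.com:8080 is a placeholder for the real proxy.
        System.setProperty("http.proxyHost", "proxy.example.com");
        System.setProperty("http.proxyPort", "8080");
        System.setProperty("https.proxyHost", "proxy.example.com");
        System.setProperty("https.proxyPort", "8080");

        // Quick check that a feed is reachable through the proxy.
        try (InputStream in = new URL("https://nvd.nist.gov/download/nvd-rss.xml").openStream()) {
            System.out.println("First byte: " + in.read());
        }
    }
}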



Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-22 Thread Carl Roberts

Thanks.  I am looking at the RSS DIH example right now.


On 1/21/15, 3:15 PM, Alexandre Rafalovitch wrote:

Solr is just fine for this.

It even ships with an example of how to read an RSS file under the DIH
directory. DIH is also most likely what you will use for the first
implementation. Don't need to worry about Stax or anything, unless
your file format is very weird or has overlapping namespaces (DIH XML
parser does not care about namespaces).

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 21 January 2015 at 14:53, Carl Roberts  wrote:

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several elements in
each node that I have to index, so I was planning to parse the XML with Stax
and extract the data from each node and add it to Solr.  There will always
be only one file to start with and then a second file as the RSS feeds
supplies updates.  I want to return certain fields of each node when I
search certain fields of the same node.  Is Solr overkill in this case?
Should I just use Lucene instead?

Regards,

Joe




Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-22 Thread Carl Roberts
Thanks for the input.  I think one benefit of using Solr is also that I 
can provide a REST API to search the indexed records.


Regards,

Joe
On 1/21/15, 3:17 PM, Shawn Heisey wrote:

On 1/21/2015 12:53 PM, Carl Roberts wrote:

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several
elements in each node that I have to index, so I was planning to parse
the XML with Stax and extract the data from each node and add it to
Solr.  There will always be only one file to start with and then a
second file as the RSS feeds supplies updates.  I want to return
certain fields of each node when I search certain fields of the same
node.  Is Solr overkill in this case?  Should I just use Lucene instead?

Effectively, Solr *is* Lucene.  You edit configuration files instead of
writing Lucene code, because Solr is a fully customizable search server,
not a programming API.  That also means that it's not as flexible as
Lucene ... but it's a lot easier.

If you're capable of writing Lucene code, chances are that you'll be
able to write an application that is highly tailored to your situation
that will have better performance than Solr ... but you'll be writing
the entire program yourself.  Solr lets you install an existing program
and just change the configuration.

Thanks,
Shawn





How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Carl Roberts

Hi,

How do you query a sentence composed of multiple words in a description 
field?


I want to search for the sentence "Oracle Fusion Middleware", but when I try 
the following search query in curl, I get nothing:


curl "http://localhost:8983/solr/nvd-rss/select?q=summary:Oracle Fusion 
Middleware&wt=xml&indent=true"


If I actually try using "Oracle+Fusion+Middleware" I get hits with 
Oracle or Fusion or Middleware but not just the ones with the string 
"Oracle Fusion Middleware".


This is the response:





  0
  1
  
true
summary:Oracle Fusion Middleware
xml
  


  
CVE-2014-6526
Unspecified vulnerability in the Oracle 
Directory Server Enterprise Edition component in Oracle Fusion 
Middleware 7.0 allows remote attackers to affect integrity via unknown 
vectors related to Admin Console.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6526

1491039690408591361
  
CVE-2014-6548
Unspecified vulnerability in the Oracle SOA 
Suite component in Oracle Fusion Middleware 11.1.1.7 allows local users 
to affect confidentiality, integrity, and availability via vectors 
related to B2B Engine.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6548

1491039690410688513
  
CVE-2014-6580
Unspecified vulnerability in the Oracle Reports 
Developer component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 
allows remote attackers to affect integrity via unknown vectors.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6580

149103969042432
  
CVE-2014-6594
Unspecified vulnerability in the Oracle 
iLearning component in Oracle iLearning 6.0 and 6.1 allows remote 
attackers to affect confidentiality via unknown vectors related to 
Learner Pages.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6594

1491039690435854337
  
CVE-2015-0372
Unspecified vulnerability in the Oracle 
Containers for J2EE component in Oracle Fusion Middleware 10.1.3.5 
allows remote attackers to affect confidentiality via unknown vectors.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0372

1491039690456825857
  
CVE-2015-0376
Unspecified vulnerability in the Oracle 
WebCenter Content component in Oracle Fusion Middleware 11.1.1.8.0 
allows remote attackers to affect integrity via unknown vectors related 
to Content Server.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0376

1491039690458923008
  
CVE-2015-0420
Unspecified vulnerability in the Oracle Forms 
component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows 
remote attackers to affect confidentiality via unknown vectors related 
to Forms Services.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0420

1491039690481991681
  
CVE-2015-0436
Unspecified vulnerability in the Oracle 
iLearning component in Oracle iLearning 6.0 and 6.1 allows remote 
attackers to affect confidentiality via unknown vectors related to 
Login.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0436

1491039690488283137
  
CVE-2014-6525
Unspecified vulnerability in the Oracle Web 
Applications Desktop Integrator component in Oracle E-Business Suite 
11.5.10.2, 12.0.6, 12.1.3, 12.2.2, 12.2.3, and 12.2.4 allows remote 
authenticated users to affect integrity via unknown vectors related to 
Templates.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6525

1491039690408591360
  
CVE-2014-6556
Unspecified vulnerability in the Oracle 
Applications DBA component in Oracle E-Business Suite 11.5.10.2, 12.0.6, 
12.1.3, 12.2.2, 12.2.3, and 12.2.4 allows remote authenticated users to 
affect confidentiality, integrity, and availability via vectors related 
to AD_DDL.
name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6556

1491039690412785664




Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Carl Roberts

Hi Walter,

If I try this from my Mac shell:

curl 
http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:"Oracle 
Fusion"


I don't get a response.

If I try this, it works!:

curl 
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=name:Oracle";


So I think the entire curl URL needs to be in quotes on the command line; my 
problem is that I do not know how to put the URL in quotes and then also 
quote the field value inside it.


BTW - If I try the first URL from a browser, it works just fine.

Any suggestions?

On 1/22/15, 5:54 PM, Walter Underwood wrote:

Your query is this:

summary:Oracle Fusion Middleware

That searches for “Oracle” in the summary field and “Fusion” and “Middleware” 
in whatever your default field is.

You want:

summary:"Oracle Fusion Middleware"

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/


On Jan 22, 2015, at 2:47 PM, Carl Roberts  wrote:


Hi,

How do you query a sentence composed of multiple words in a description field?

I want to search for the sentence "Oracle Fusion Middleware", but when I try the 
following search query in curl, I get nothing:

curl "http://localhost:8983/solr/nvd-rss/select?q=summary:Oracle Fusion 
Middleware&wt=xml&indent=true"

If I actually try using "Oracle+Fusion+Middleware" I get hits with Oracle or Fusion or 
Middleware but not just the ones with the string "Oracle Fusion Middleware".

This is the response:





  0
  1
  
true
summary:Oracle Fusion Middleware
xml
  


  
CVE-2014-6526
Unspecified vulnerability in the Oracle Directory Server 
Enterprise Edition component in Oracle Fusion Middleware 7.0 allows remote attackers to affect 
integrity via unknown vectors related to Admin Console.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6526
1491039690408591361
  
CVE-2014-6548
Unspecified vulnerability in the Oracle SOA Suite component 
in Oracle Fusion Middleware 11.1.1.7 allows local users to affect confidentiality, integrity, and 
availability via vectors related to B2B Engine.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6548
1491039690410688513
  
CVE-2014-6580
Unspecified vulnerability in the Oracle Reports Developer 
component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows remote attackers to affect 
integrity via unknown vectors.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6580
149103969042432
  
CVE-2014-6594
Unspecified vulnerability in the Oracle iLearning component 
in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect confidentiality via unknown vectors 
related to Learner Pages.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6594
1491039690435854337
  
CVE-2015-0372
Unspecified vulnerability in the Oracle Containers for J2EE 
component in Oracle Fusion Middleware 10.1.3.5 allows remote attackers to affect confidentiality via 
unknown vectors.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0372
1491039690456825857
  
CVE-2015-0376
Unspecified vulnerability in the Oracle WebCenter Content 
component in Oracle Fusion Middleware 11.1.1.8.0 allows remote attackers to affect integrity via 
unknown vectors related to Content Server.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0376
1491039690458923008
  
CVE-2015-0420
Unspecified vulnerability in the Oracle Forms component in 
Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows remote attackers to affect confidentiality via 
unknown vectors related to Forms Services.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0420
1491039690481991681
  
CVE-2015-0436
Unspecified vulnerability in the Oracle iLearning component 
in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect confidentiality via unknown vectors 
related to Login.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0436
1491039690488283137
  
CVE-2014-6525
Unspecified vulnerability in the Oracle Web Applications 
Desktop Integrator component in Oracle E-Business Suite 11.5.10.2, 12.0.6, 12.1.3, 12.2.2, 12.2.3, 
and 12.2.4 allows remote authenticated users to affect integrity via unknown vectors related to 
Templates.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6525
1491039690408591360
  
CVE-2014-6556
Unspecified vulnerability in the Oracle Applications DBA 
component in Oracle E-Business Suite 11.5.10.2, 12.0.6, 12.1.3, 12.2.2, 12.2.3, and 12.2.4 allows 
remote authenticated users to affect confidentiality, integrity, and availability via vectors related 
to AD_DDL.
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6556
1491039690412785664






Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Carl Roberts
Thanks Shawn - I tried this but it does not work.  I don't even get a 
response from curl when I try that format and when I look at the logging 
on the console for Jetty I don't see anything new - it seems that the 
request is not even making it to the server.



On 1/22/15, 6:43 PM, Shawn Heisey wrote:

On 1/22/2015 4:31 PM, Carl Roberts wrote:

Hi Walter,

If I try this from my Mac shell:

 curl
http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:"Oracle
Fusion"

I don't get a response.

Quotes are a special character to the shell on your mac, and get removed
from what the curl command sees.  You'll need to put the whole thing in
quotes (so that characters like & are not interpreted by the shell) and
then escape the quotes that you want to actually be handled by curl:

curl
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:\"Oracle
Fusion\""

Thanks,
Shawn





Re: How do you query a sentence composed of multiple words in a description field?

2015-01-23 Thread Carl Roberts

Thanks Erick,

I think I am going to start using the browser for testing...:) Perhaps 
also a REST client for the Mac.


Regards,

Joe

On 1/22/15, 6:56 PM, Erick Erickson wrote:

Have you considered using the admin/query form? Lots of escaping is done
there for you. Once you have the form of the query down and know what to
expect, it's probably easier to enter "escaping hell" with curl and the
like

And what is your schema definition for the field in question? the
admin/analysis page can help a lot here.

Best,
Erick

On Thu, Jan 22, 2015 at 3:51 PM, Carl Roberts 
wrote:
Thanks Shawn - I tried this but it does not work.  I don't even get a
response from curl when I try that format and when I look at the logging on
the console for Jetty I don't see anything new - it seems that the request
is not even making it to the server.



On 1/22/15, 6:43 PM, Shawn Heisey wrote:


On 1/22/2015 4:31 PM, Carl Roberts wrote:


Hi Walter,

If I try this from my Mac shell:

  curl
http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary
:"Oracle
Fusion"

I don't get a response.


Quotes are a special character to the shell on your mac, and get removed
from what the curl command sees.  You'll need to put the whole thing in
quotes (so that characters like & are not interpreted by the shell) and
then escape the quotes that you want to actually be handled by curl:

curl
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary
:\"Oracle
Fusion\""

Thanks,
Shawn
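
A side note on the escaping issue above (not from the original thread): from 
code there is no shell involved at all, and in a raw URL the quotes can also 
be percent-encoded as %22. A minimal SolrJ sketch against the nvd-rss core 
used in this thread; the host, core and field names are taken from the 
thread, the rest is an assumption.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class PhraseQuery {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/nvd-rss");

        // The quotes go straight into the Java string, so there is no shell
        // escaping to fight; SolrJ encodes the request parameters for you.
        SolrQuery query = new SolrQuery("summary:\"Oracle Fusion Middleware\"");
        query.setRows(10);

        QueryResponse response = server.query(query);
        for (SolrDocument doc : response.getResults()) {
            System.out.println(doc.getFieldValue("id") + " -> " + doc.getFieldValue("summary"));
        }
        server.shutdown();
    }
}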






Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-23 Thread Carl Roberts
I got the RSS DIH example to work with my own RSS feed and it works 
great - thanks for the help.


On 1/22/15, 11:20 AM, Carl Roberts wrote:

Thanks. I am looking at the RSS DIH example right now.


On 1/21/15, 3:15 PM, Alexandre Rafalovitch wrote:

Solr is just fine for this.

It even ships with an example of how to read an RSS file under the DIH
directory. DIH is also most likely what you will use for the first
implementation. Don't need to worry about Stax or anything, unless
your file format is very weird or has overlapping namespaces (DIH XML
parser does not care about namespaces).

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 21 January 2015 at 14:53, Carl Roberts 
 wrote:

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several 
elements in
each node that I have to index, so I was planning to parse the XML 
with Stax
and extract the data from each node and add it to Solr.  There will 
always
be only one file to start with and then a second file as the RSS 
feeds

supplies updates.  I want to return certain fields of each node when I
search certain fields of the same node.  Is Solr overkill in this case?
Should I just use Lucene instead?

Regards,

Joe






Is it possible to read multiple RSS feeds and XML Zip file feeds with DIH into one core?

2015-01-23 Thread Carl Roberts

Hi,

I have the RSS DIH example working with my own RSS feed - here is the 
configuration for it.





https://nvd.nist.gov/download/nvd-rss.xml";
processor="XPathEntityProcessor"
forEach="/RDF/item"
transformer="DateFormatTransformer">

commonField="true" />
commonField="true" />
commonField="true" />
commonField="true" />






However, my problem is that I also have to load multiple XML feeds into 
the same core.  Here is one example (there are about 10 of them):


http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2014.xml.zip


Is there any built-in functionality that would allow me to do this? 
Basically, the use-case is to load and index all the XML ZIP files 
first, and then check the RSS feed every two hours and update the 
indexes with any new ones.


Regards,

Joe




Re: Is it possible to read multiple RSS feeds and XML Zip file feeds with DIH into one core?

2015-01-23 Thread Carl Roberts

Hi Alex,

If I am understanding this correctly, I can define multiple entities 
like this?






...


How would I trigger loading certain entities during start?

How would I trigger loading other entities during update?

Is there a way to set an auto-update for certain entities so that I 
don't have to invoke an update via curl?


Where / how do I specify the preImportDeleteQuery to avoid deleting 
everything upon each update?


Is there an example or doc that shows how to do all this?

Regards,

Joe

On 1/23/15, 11:24 AM, Alexandre Rafalovitch wrote:

You can define both multiple entities in the same file and nested
entities if your list comes from an external source (e.g. a text file
of URLs).
You can also trigger DIH with a name of a specific entity to load just that.
You can even pass DIH configuration file when you are triggering the
processing start, so you can have different files completely for
initial load and update. Though you can just do the same with
entities.

The only thing to be aware of is that before an entity definition is
processed, a delete command is run. By default, it's "delete all", so
executing one entity will delete everything but then just populate
that one entity's results. You can avoid that by defining
preImportDeleteQuery and having a clear identifier on content
generated by each entity (e.g. source, either extracted or manually
added with TemplateTransformer).

Regards,
Alex.


Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 January 2015 at 11:15, Carl Roberts  wrote:

Hi,

I have the RSS DIH example working with my own RSS feed - here is the
configuration for it.


 
 
 https://nvd.nist.gov/download/nvd-rss.xml";
 processor="XPathEntityProcessor"
 forEach="/RDF/item"
 transformer="DateFormatTransformer">

 
 
 
 

 
 


However, my problem is that I also have to load multiple XML feeds into the
same core.  Here is one example (there are about 10 of them):

http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2014.xml.zip


Is there any built-in functionality that would allow me to do this?
Basically, the use-case is to load and index all the XML ZIP files first,
and then check the RSS feed every two hours and update the indexes with any
new ones.

Regards,

Joe
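
To make the per-entity triggering concrete (this example is not from the 
original thread): preImportDeleteQuery lives in the entity definition itself, 
while on the request side DIH accepts entity and clean parameters. A small 
sketch of kicking off one entity without wiping the rest; the entity name is 
hypothetical, the core name is the one used in this thread.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class TriggerEntityImport {
    public static void main(String[] args) throws Exception {
        // entity=... runs only that entity from the DIH config (the entity
        // name "nvd-rss" here is an assumption); clean=false keeps documents
        // loaded by the other entities, otherwise a full-import starts by
        // deleting everything.
        String url = "http://localhost:8983/solr/nvd-rss/dataimport"
                + "?command=full-import&entity=nvd-rss&clean=false";

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new URL(url).openStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);   // DIH status response
            }
        }
    }
}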






Sporadic Socket Timeout Error during Import

2015-01-23 Thread Carl Roberts

Hi,

I am using the DIH RSS example and I am running into a sporadic socket 
timeout error on every third or fourth request. Below is the stack trace. 
What is the default socket timeout for reads, and how can I increase it?



15046 [Thread-17] ERROR org.apache.solr.handler.dataimport.URLDataSource 
– Exception thrown while getting data

java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
at sun.security.ssl.InputRecord.read(InputRecord.java:480)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:927)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:884)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
at 
sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:98)
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:42)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:283)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:224)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:204)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
815049 [Thread-17] ERROR org.apache.solr.handler.dataimport.DocBuilder – 
Exception while processing: nvd-rss document : SolrInputDocument(fields: 
[]):org.apache.solr.handler.dataimport.DataImportHandlerException: 
Exception in invoking url https://nvd.nist.gov/download/nvd-rss.xml 
Processing Document # 1
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:115)
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:42)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:283)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:224)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:204)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)

Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
at sun.security.ssl.InputRecord.read(InputRecord.java:480)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:927)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:884)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at 
sun.net.www.protocol.http.HttpURLConnection.g
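
The timeout question gets no direct answer in this archive. URLDataSource 
appears to accept connectionTimeout and readTimeout attributes (in 
milliseconds) on its dataSource element, with fairly short defaults - worth 
verifying against the 4.10 source. If the fetch ends up in custom code, as 
with the ZIPURLDataSource class later in this archive, the plain java.net API 
lets you raise both timeouts explicitly; a sketch with arbitrary values:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FeedFetch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://nvd.nist.gov/download/nvd-rss.xml");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(10_000);   // 10 s to establish the connection
        conn.setReadTimeout(60_000);      // 60 s of read inactivity before SocketTimeoutException

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // hand the XML off to whatever does the parsing
            }
        } finally {
            conn.disconnect();
        }
    }
}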

Re: Is it possible to read multiple RSS feeds and XML Zip file feeds with DIH into one core?

2015-01-23 Thread Carl Roberts

OK - Thanks for the doc.

Is it possible to just provide an empty value to preImportDeleteQuery to 
disable the delete prior to import?


Will the data still be deleted for each entity during a delta-import 
instead of full-import?


Is there any capability in the handler to unzip an XML file from a URL 
prior to reading it or can I perhaps hook a custom pre-processing handler?


Regards,

Joe


On 1/23/15, 1:40 PM, Alexandre Rafalovitch wrote:

https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

Admin UI has the interface, so you can play there once you define it.

You do have to use curl; there is no built-in scheduler.

Regards,
Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 January 2015 at 13:29, Carl Roberts  wrote:

Hi Alex,

If I am understanding this correctly, I can define multiple entities like
this?


 
 
 
 ...


How would I trigger loading certain entities during start?

How would I trigger loading other entities during update?

Is there a way to set an auto-update for certain entities so that I don't
have to invoke an update via curl?

Where / how do I specify the preImportDeleteQuery to avoid deleting
everything upon each update?

Is there an example or doc that shows how to do all this?

Regards,

Joe


On 1/23/15, 11:24 AM, Alexandre Rafalovitch wrote:

You can define both multiple entities in the same file and nested
entities if your list comes from an external source (e.g. a text file
of URLs).
You can also trigger DIH with a name of a specific entity to load just
that.
You can even pass DIH configuration file when you are triggering the
processing start, so you can have different files completely for
initial load and update. Though you can just do the same with
entities.

The only thing to be aware of is that before an entity definition is
processed, a delete command is run. By default, it's "delete all", so
executing one entity will delete everything but then just populate
that one entity's results. You can avoid that by defining
preImportDeleteQuery and having a clear identifier on content
generated by each entity (e.g. source, either extracted or manually
added with TemplateTransformer).

Regards,
 Alex.


Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 January 2015 at 11:15, Carl Roberts 
wrote:

Hi,

I have the RSS DIH example working with my own RSS feed - here is the
configuration for it.


  
  
  https://nvd.nist.gov/download/nvd-rss.xml";
  processor="XPathEntityProcessor"
  forEach="/RDF/item"
  transformer="DateFormatTransformer">

  
  
  
  

  
  


However, my problem is that I also have to load multiple XML feeds into
the
same core.  Here is one example (there are about 10 of them):

http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2014.xml.zip


Is there any built-in functionality that would allow me to do this?
Basically, the use-case is to load and index all the XML ZIP files first,
and then check the RSS feed every two hours and update the indexes with
any
new ones.

Regards,

Joe






Re: Is it possible to read multiple RSS feeds and XML Zip file feeds with DIH into one core?

2015-01-23 Thread Carl Roberts
Excellent - thanks Shalin.  But how does delta-import work?  Does it do 
a clean also?  Does it require a unique Id?  Does it update existing 
records and only add when necessary?


And, how would I go about unzipping the content from a URL to then 
import the unzipped XML?  Is the recommended way to extend the 
URLDataSource class or is there any built-in logic to plug in 
pre-processing handlers?



On 1/23/15, 2:39 PM, Shalin Shekhar Mangar wrote:

If you add clean=false as a parameter to the full-import then deletion is
disabled. Since you are ingesting RSS there is no need for deletion at all
I guess.

On Fri, Jan 23, 2015 at 7:31 PM, Carl Roberts 
wrote:
OK - Thanks for the doc.

Is it possible to just provide an empty value to preImportDeleteQuery to
disable the delete prior to import?

Will the data still be deleted for each entity during a delta-import
instead of full-import?

Is there any capability in the handler to unzip an XML file from a URL
prior to reading it or can I perhaps hook a custom pre-processing handler?

Regards,

Joe



On 1/23/15, 1:40 PM, Alexandre Rafalovitch wrote:


https://cwiki.apache.org/confluence/display/solr/
Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

Admin UI has the interface, so you can play there once you define it.

You do have to use curl; there is no built-in scheduler.

Regards,
 Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 January 2015 at 13:29, Carl Roberts 
wrote:


Hi Alex,

If I am understanding this correctly, I can define multiple entities like
this?


  
  
  
  ...


How would I trigger loading certain entities during start?

How would I trigger loading other entities during update?

Is there a way to set an auto-update for certain entities so that I don't
have to invoke an update via curl?

Where / how do I specify the preImportDeleteQuery to avoid deleting
everything upon each update?

Is there an example or doc that shows how to do all this?

Regards,

Joe


On 1/23/15, 11:24 AM, Alexandre Rafalovitch wrote:


You can define both multiple entities in the same file and nested
entities if your list comes from an external source (e.g. a text file
of URLs).
You can also trigger DIH with a name of a specific entity to load just
that.
You can even pass DIH configuration file when you are triggering the
processing start, so you can have different files completely for
initial load and update. Though you can just do the same with
entities.

The only thing to be aware of is that before an entity definition is
processed, a delete command is run. By default, it's "delete all", so
executing one entity will delete everything but then just populate
that one entity's results. You can avoid that by defining
preImportDeleteQuery and having a clear identifier on content
generated by each entity (e.g. source, either extracted or manually
added with TemplateTransformer).

Regards,
  Alex.


Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 January 2015 at 11:15, Carl Roberts <
carl.roberts.zap...@gmail.com>
wrote:


Hi,

I have the RSS DIH example working with my own RSS feed - here is the
configuration for it.


   
   
   https://nvd.nist.gov/download/nvd-rss.xml";
   processor="XPathEntityProcessor"
   forEach="/RDF/item"
   transformer="DateFormatTransformer">

   
   
   
   

   
   


However, my problem is that I also have to load multiple XML feeds into
the
same core.  Here is one example (there are about 10 of them):

http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2014.xml.zip


Is there any built-in functionality that would allow me to do this?
Basically, the use-case is to load and index all the XML ZIP files
first,
and then check the RSS feed every two hours and update the indexes with
any
new ones.

Regards,

Joe
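
Since there is no built-in scheduler, the two-hour RSS check from the 
use-case above has to be driven externally - cron plus curl, or a few lines 
of Java. A sketch (not from the original thread) using clean=false as 
suggested here, with the core name from this thread:

import java.io.InputStream;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ImportScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        Runnable kickOffImport = new Runnable() {
            public void run() {
                // clean=false so the periodic RSS import does not wipe the
                // documents loaded from the ZIP feeds.
                String url = "http://localhost:8983/solr/nvd-rss/dataimport"
                        + "?command=full-import&clean=false";
                try (InputStream in = new URL(url).openStream()) {
                    while (in.read() != -1) {
                        // drain the status response; errors show up in the Solr log
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        };

        // Run once at startup, then every two hours, matching the use-case above.
        scheduler.scheduleAtFixedRate(kickOffImport, 0, 2, TimeUnit.HOURS);
    }
}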









Fwd: Need Help with custom ZIPURLDataSource class

2015-01-23 Thread Carl Roberts


Hi,

I created a custom ZIPURLDataSource class to unzip the content from an
http URL for an XML ZIP file and it seems to be working (at least I have
no errors), but no data is imported.

Here is my configuration in rss-data-config.xml:




https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip";
processor="XPathEntityProcessor"
forEach="/nvd/entry"
transformer="DateFormatTransformer">













Attached is the ZIPURLDataSource.java file.
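
The attachment itself is not preserved in this archive. As a rough, 
independent sketch of the unzip-from-URL step such a data source has to 
perform (the DIH DataSource plumbing around it is omitted; the URL is the one 
from the config above):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipFeedReader {
    // Opens the remote ZIP, positions the stream at its first entry and
    // returns a Reader over the uncompressed XML inside it.
    static Reader openFirstEntry(String zipUrl) throws Exception {
        ZipInputStream zip = new ZipInputStream(new URL(zipUrl).openStream());
        ZipEntry entry = zip.getNextEntry();
        if (entry == null) {
            zip.close();
            throw new IllegalStateException("Empty ZIP at " + zipUrl);
        }
        // Reading from the ZipInputStream now yields that entry's bytes.
        return new BufferedReader(new InputStreamReader(zip, "UTF-8"));
    }

    public static void main(String[] args) throws Exception {
        try (Reader r = openFirstEntry(
                "https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip")) {
            char[] buf = new char[200];
            int n = r.read(buf);
            System.out.println(new String(buf, 0, n));   // first bit of the unzipped XML
        }
    }
}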

The custom class actually unzips and saves the raw XML to disk, which I have 
verified to be a valid XML file.  The file has one or more entries (here is 
an example):
http://scap.nist.gov/schema/scap-core/0.1";
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
xmlns:patch="http://scap.nist.gov/schema/patch/0.1";
xmlns:vuln="http://scap.nist.gov/schema/vulnerability/0.4";
xmlns:cvss="http://scap.nist.gov/schema/cvss-v2/0.2";
xmlns:cpe-lang="http://cpe.mitre.org/language/2.0";
xmlns="http://scap.nist.gov/schema/feed/vulnerability/2.0";
pub_date="2015-01-10T05:37:05"
xsi:schemaLocation="http://scap.nist.gov/schema/patch/0.1
http://nvd.nist.gov/schema/patch_0.1.xsd
http://scap.nist.gov/schema/scap-core/0.1
http://nvd.nist.gov/schema/scap-core_0.1.xsd
http://scap.nist.gov/schema/feed/vulnerability/2.0
http://nvd.nist.gov/schema/nvd-cve-feed_2.0.xsd"; nvd_xml_version="2.0">

http://nvd.nist.gov/";>



























cpe:/o:freebsd:freebsd:2.2.8
cpe:/o:freebsd:freebsd:1.1.5.1
cpe:/o:freebsd:freebsd:2.2.3
cpe:/o:freebsd:freebsd:2.2.2
cpe:/o:freebsd:freebsd:2.2.5
cpe:/o:freebsd:freebsd:2.2.4
cpe:/o:freebsd:freebsd:2.0.5
cpe:/o:freebsd:freebsd:2.2.6
cpe:/o:freebsd:freebsd:2.1.6.1
cpe:/o:freebsd:freebsd:2.0.1
cpe:/o:freebsd:freebsd:2.2
cpe:/o:freebsd:freebsd:2.0
cpe:/o:openbsd:openbsd:2.3
cpe:/o:freebsd:freebsd:3.0
cpe:/o:freebsd:freebsd:1.1
cpe:/o:freebsd:freebsd:2.1.6
cpe:/o:openbsd:openbsd:2.4
cpe:/o:bsdi:bsd_os:3.1
cpe:/o:freebsd:freebsd:1.0
cpe:/o:freebsd:freebsd:2.1.7
cpe:/o:freebsd:freebsd:1.2
cpe:/o:freebsd:freebsd:2.1.5
cpe:/o:freebsd:freebsd:2.1.7.1

CVE-1999-0001
1999-12-30T00:00:00.000-05:00
2010-12-16T00:00:00.000-05:00


5.0
NETWORK
LOW
NONE
NONE
NONE
PARTIAL
http://nvd.nist.gov
2004-01-01T00:00:00.000-05:00




OSVDB
http://www.osvdb.org/5707";
xml:lang="en">5707


CONFIRM
http://www.openbsd.org/errata23.html#tcpfix";
xml:lang="en">http://www.openbsd.org/errata23.html#tcpfix

ip_input.c in BSD-derived TCP/IP implementations allows
remote attackers to cause a denial of service (crash or hang) via
crafted packets.



Here is the curl command:

curl http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import

And here is the output from the console for Jetty:

main{StandardDirectoryReader(segments_1:1:nrt)}
2407 [coreLoadExecutor-5-thread-1] INFO
org.apache.solr.core.CoreContainer – registering core: nvd-rss
2409 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
user.dir=/Users/carlroberts/dev/solr-4.10.3/example
2409 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
SolrDispatchFilter.init() done
2431 [main] INFO org.eclipse.jetty.server.AbstractConnector – Started
SocketConnector@0.0.0.0:8983
2450 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
[nvd-rss] webapp=null path=null
params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false}
hits=0 status=0 QTime=43
2451 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
QuerySenderListener done.
2451 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SpellCheckComponent – Loading spell
index for spellchecker: default
2451 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SpellCheckComponent – Loading spell
index for spellchecker: wordbreak
2452 [searcherExecutor-6-thread-1] INFO
org.apache.solr.handler.component.SuggestComponent – Loading suggester
index for: mySuggester
2452 [searcherExecutor-6-thread-1] INFO
org.apache.solr.spelling.suggest.SolrSuggester – reload()
2452 [searcherExecutor-6-thread-1] INFO
org.apache.solr.spelling.suggest.SolrSuggester – build()
2459 [searcherExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore –
[nvd-rss] Registered new searcher Searcher@df9e84e[nvd-rss]
main{StandardDirectoryReader(segments_1:1:nrt)}
8371 [qtp1640586218-17] INFO
org.apache.solr.handler.dataimport.DataImporter – Loading DIH
Configuration: rss-data-config.xml
8379 [qtp1640586218-17] INFO
org.apache.solr.handler.dataimport.DataImporter – Data Configuration
loaded successfully
8383 [Thread-15] INFO org.apache.solr.handler.dataimport.DataImporter –
Starting Full Import
8384 [qtp1640586218-17] INFO org.apache.solr.core.SolrCore – [nvd-rss]
webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=15
8396 [Thread-15] INFO
org.apache.solr.handler.dataimport.SimplePropertiesWriter – Read
dataimport.properties
23431 [commitScheduler-8-thread-1] INFO
org.apache.solr.update.UpdateHandler – start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommi