Thank you so much, Matt. You have answered my question.

Thanks
Aruna

From: Matt Burgess [mailto:[email protected]]
Sent: Monday, December 11, 2017 7:26 PM
To: [email protected]
Subject: Re: ListS3 Processor Error

Aruna,

The index and type in Elasticsearch are forms of partitioning that help 
users organize data, and they also help with indexing and searching it. 
Types are not always required, but an index is. Imagine you are trying to store 
a bunch of tweets from a Twitter feed (or firehose) into Elasticsearch. You 
could call the index "twitter" and type "tweet" for each tweet that you store 
in the twitter index. Now say you want to also put Twitter user information 
into that index. You can reuse "twitter" as the index but then specify "user" 
as the type. Now you can search the entire index for information in tweets and 
user data, or you can additionally search by type, perhaps searching only the 
documents with user type.
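
To make the partitioning idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Elasticsearch and not any client library; the class name TinyIndexStore, its methods, and the sample documents are all hypothetical and exist only for illustration. It shows one index holding two types, with a search over the whole index versus a search limited to one type.

```python
from collections import defaultdict


class TinyIndexStore:
    """Toy stand-in for index/type partitioning (hypothetical, not Elasticsearch)."""

    def __init__(self):
        # index name -> type name -> list of stored documents
        self._data = defaultdict(lambda: defaultdict(list))

    def put(self, index, doc_type, doc):
        """Store a document under the given index and type."""
        self._data[index][doc_type].append(doc)

    def search(self, index, term, doc_type=None):
        """Return documents in `index` whose field values contain `term`.

        If `doc_type` is given, only documents of that type are considered.
        """
        types = self._data.get(index, {})
        if doc_type is not None:
            selected = [types.get(doc_type, [])]
        else:
            selected = list(types.values())
        needle = term.lower()
        return [
            doc
            for docs in selected
            for doc in docs
            if any(needle in str(value).lower() for value in doc.values())
        ]


store = TinyIndexStore()
store.put("twitter", "tweet", {"user": "alice", "text": "Loading PDFs with NiFi"})
store.put("twitter", "user", {"name": "alice", "bio": "NiFi user"})

# Search the entire "twitter" index, across tweets and users.
everything = store.search("twitter", "nifi")

# Search only documents of type "user" in the same index.
users_only = store.search("twitter", "nifi", doc_type="user")
```

The first search matches both the tweet and the user record, while the second matches only the user record.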

In the REST API, the index and type are specified in the request path, for 
example GET /twitter/tweet/1 or GET /twitter/user/2. The Elasticsearch 
processors use the index and type information to determine the right call to 
make to Elasticsearch.
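
As a rough illustration of how an index, type, and document ID turn into a REST path, here is a small Python sketch. The helper names document_path and search_path are hypothetical (they are not part of NiFi or of any Elasticsearch client), and the path shapes assume an Elasticsearch version that still supports types in the URL; later releases removed types, so check this against the version you run.

```python
from urllib.parse import quote


def document_path(index, doc_type, doc_id):
    """Build a path such as /twitter/tweet/1 (hypothetical helper)."""
    parts = (index, doc_type, doc_id)
    return "/" + "/".join(quote(str(part), safe="") for part in parts)


def search_path(index, doc_type=None):
    """Build a search path for a whole index, or for one type within it."""
    parts = [index] if doc_type is None else [index, doc_type]
    return "/" + "/".join(quote(str(part), safe="") for part in parts) + "/_search"


tweet_path = document_path("twitter", "tweet", 1)
user_path = document_path("twitter", "user", 2)
whole_index_search = search_path("twitter")
user_only_search = search_path("twitter", "user")
```

Here tweet_path and user_path correspond to the two GET examples above, and the two search paths show the difference between querying the whole index and querying a single type.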

You can certainly choose "pdf" as the type if you like, although depending on 
the sort of queries you'll be running, you may want to pick an index that 
incorporates any kind of data you'll be keeping together, and a type that is 
more domain-specific (such as "customer" if it is a PDF full of customer data). 
Please let me know if that answers your question; I can provide more 
information if need be.

Regards,
Matt


On Mon, Dec 11, 2017 at 4:18 PM, Joe Witt wrote:
For that we'll need someone familiar with that processor/Elastic to chime in :)

Thanks

On Mon, Dec 11, 2017 at 4:16 PM, Aruna Sankaralingam wrote:
Oops, I overlooked your question about the version. My apologies. I am 
using NiFi v1.4.

I moved the PDF file to another folder in the same S3 bucket and NiFi was able 
to pick it up.

Initially it was in
S3 > part-d-prescription-drug/unstructured
I moved to
S3 > Nifi-Pecos-files

I still don’t know what was wrong with the old location. But for now, I am 
using the one that works.

I am trying to put this PDF file into Elasticsearch.

I am not sure what I should give for “Index” and “Type”. Should the type be 
“PDF”?

Thanks
Aruna

From: Joe Witt [mailto:[email protected]<mailto:[email protected]>]
Sent: Monday, December 11, 2017 3:32 PM

To: [email protected]<mailto:[email protected]>
Subject: Re: ListS3 Processor Error

Aruna,

We'll need to know more about your config/env to help, I think. I am not aware 
of any normal usage situation that should result in truncated responses. It is 
possible it is a coding bug we can resolve, but I think we'll need more details. 
Did you see the questions in my last reply?

Thanks

On Mon, Dec 11, 2017 at 2:50 PM, Aruna Sankaralingam wrote:
Could someone please let me know what is wrong with the configuration that is 
causing it to fail?

From: Aruna Sankaralingam 
[mailto:[email protected]<mailto:[email protected]>]
Sent: Monday, December 11, 2017 1:07 PM
To: [email protected]<mailto:[email protected]>
Subject: RE: ListS3 Processor Error

I have attached my nifi-app.log. Could you please let me know what went wrong?

From: Joe Witt [mailto:[email protected]]
Sent: Friday, December 08, 2017 4:04 PM
To: [email protected]<mailto:[email protected]>
Subject: Re: ListS3 Processor Error

Here is an example I found for another processor:

https://mail-archives.apache.org/mod_mbox/nifi-dev/201509.mbox/%3CCAFddr26AEVqnoQ=mWr7DSNDFVrr9NuYy9GCcXg=4fyycqab...@mail.gmail.com%3E

Thanks

On Fri, Dec 8, 2017 at 4:02 PM, Aruna Sankaralingam wrote:
Joe,
Could you please let me know how to turn on the debug logging?

From: Joe Witt [mailto:[email protected]<mailto:[email protected]>]
Sent: Friday, December 08, 2017 3:59 PM
To: [email protected]<mailto:[email protected]>
Subject: Re: ListS3 Processor Error

What version of NiFi?

Looks like either a classpath/classloader issue OR the Amazon client library 
cannot parse the response it is getting back...

The logs/nifi-app.log should have the full stack trace. If not, you can turn on 
debug logging for that processor and perhaps then it will.

Thanks

On Fri, Dec 8, 2017 at 3:56 PM, Aruna Sankaralingam wrote:
I am trying to get a PDF file from S3 and load it into Elasticsearch. The ListS3 
processor is giving me this error. Could someone please let me know where I am 
going wrong?

20:52:25 UTC
ERROR
37d7226e-0160-1000-6049-d4c489cd32f3
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] 
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due 
to com.amazonaws.SdkClientException: Failed to parse XML document with handler 
class 
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler:
 Failed to parse XML document with handler class 
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
20:52:25 UTC
WARNING
37d7226e-0160-1000-6049-d4c489cd32f3
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] Processor Administratively 
Yielded for 1 sec due to processing failure
20:52:26 UTC
ERROR
37d7226e-0160-1000-6049-d4c489cd32f3
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] 
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process due to 
com.amazonaws.SdkClientException: Failed to parse XML document with handler 
class 
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler;
 rolling back session: Failed to parse XML document with handler class 
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
20:52:26 UTC
ERROR
37d7226e-0160-1000-6049-d4c489cd32f3
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] 
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due 
to com.amazonaws.SdkClientException: Failed to parse XML document with handler 
class 
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler:
 Failed to parse XML document with handler class 
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
20:52:26 UTC
WARNING
37d7226e-0160-1000-6049-d4c489cd32f3
ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] Processor Administratively 
Yielded for 1 sec due to processing failure