Nifi - DistributedMapCacheService - how many items is considered big

2020-01-14 Thread Christopher J. Amatulli
How many items in a DistributedMapCacheService would be considered excessive? I have a situation where I was considering loading around 200 million entries, but I started wondering where the limitation (a hard wall or a performance hit) lies within the service.

I was thinking about using the cache service as a temporary key/value store for the duration of the entire process, and pushing everything to MySQL when all processing completes. When I looked at my key list, I noticed it was about 150 million keys, each with a corresponding JSON value to be stored in the map.

That got me thinking... good idea or bad one? What do you think?

Re: Nifi - DistributedMapCacheService - how many items is considered big

2020-01-14 Thread Shawn Weeks
If you're using an external implementation like HBase, I wouldn't expect any issue, assuming it has enough space. However, if you are using the built-in one, the DistributedMapCacheServer, then all the values need to fit in memory. One issue I do see is that there isn't a bulk way to get data back out of the cache; it only supports individual key/value lookups.

It would help to understand your workflow a bit more.

Thanks
Shawn

From: "Christopher J. Amatulli" 
Reply-To: "users@nifi.apache.org" 
Date: Tuesday, January 14, 2020 at 2:47 PM
To: "users@nifi.apache.org" 
Subject: Nifi - DistributedMapCacheService - how many items is considered big


RE: Nifi - DistributedMapCacheService - how many items is considered big

2020-01-14 Thread Christopher J. Amatulli
I think you answered my question. The DistributedMapCacheServer that comes with NiFi stores its data in memory, so memory will be the constraint. If all I store in the cache is a key plus a small Avro/JSON record with 4 columns, I could probably fit millions without a problem.
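A quick back-of-the-envelope check supports being careful here. This is a minimal sketch; the per-entry sizes and the JVM overhead figure are illustrative assumptions, not measured values:

```python
# Rough estimate of heap needed to hold the whole cache in memory.
# All sizes below are assumptions for illustration, not measurements.
entries = 150_000_000          # keys observed in the source data
avg_key_bytes = 36             # e.g. a UUID-style string key
avg_value_bytes = 200          # a small 4-column JSON document
jvm_overhead_bytes = 64        # rough per-entry map/object overhead

total_gb = entries * (avg_key_bytes + avg_value_bytes + jvm_overhead_bytes) / 1e9
print(f"~{total_gb:.0f} GB of heap")
```

Even with small records, 150 million entries lands in the tens of gigabytes of heap under these assumptions, which is why an external store (HBase, Redis, or MySQL directly) tends to be the safer choice at this scale.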

I am going to experiment with this a little. Thanks.


From: Shawn Weeks 
Sent: Tuesday, January 14, 2020 4:01 PM
To: users@nifi.apache.org
Subject: Re: Nifi - DistributedMapCacheService - how many items is considered 
big

Re: Nifi - DistributedMapCacheService - how many items is considered big

2020-01-14 Thread Mike Thomsen
You can also use the Redis map cache implementation here.

On Tue, Jan 14, 2020 at 4:51 PM Christopher J. Amatulli <
camatu...@technicallycreative.com> wrote:



DistributedMapCacheService - automated creation

2020-01-14 Thread William Gosse
Is there any way to automate the creation of a DistributedMapCacheService?

I was hoping I could do this via a template, but it doesn't seem to get exported the way the DistributedMapCacheClientService does.

The client got created just fine, but there's no service.

I don't see a way to do this with the REST API either.

Any help would be greatly appreciated.
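For what it's worth, the REST API does appear to cover this case: controller-level services (which is where a DistributedMapCacheServer usually lives, and likely why templates miss it) can be created with `POST /nifi-api/controller/controller-services`. A hedged sketch of the request payload follows; the port and cache-size values are made-up examples, not recommendations:

```python
import json

# Hypothetical request body for POST /nifi-api/controller/controller-services,
# which creates a controller-level service such as DistributedMapCacheServer.
# The property values below are example settings only.
payload = {
    "revision": {"version": 0},
    "component": {
        "type": "org.apache.nifi.distributed.cache.server.map."
                "DistributedMapCacheServer",
        "properties": {
            "Port": "4557",
            "Maximum Cache Entries": "10000",
        },
    },
}
print(json.dumps(payload, indent=2))
```

Posting this body with `Content-Type: application/json` should create the service; it then still needs to be enabled with a follow-up request to the service's run-status endpoint.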




Re: DistributedMapCacheService - automated creation

2020-01-14 Thread Mike Thomsen
What is your use case? This sounds like something you could handle implicitly with the HBase version by creating one big cache table, taking care that column qualifiers don't overlap, and possibly making liberal use of HBase TTLs.

On Tue, Jan 14, 2020 at 6:04 PM William Gosse 
wrote:



code page conversion

2020-01-14 Thread Guy Smadga
Hi all,
I have a code page conversion issue.
I have a flow that reads records from MongoDB and inserts the results into Oracle.
The MongoDB encoding is UTF-8, and the Oracle encoding is ISO-8859-8.
Some special characters are not shown correctly in Oracle – they show up as ???.
I tried using the ConvertCharacterSet processor before inserting into Oracle, but with no luck (same ???).
Any ideas?
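One thing worth checking first is whether the problem characters exist in ISO-8859-8 at all: Hebrew letters do, but many other characters (accented Latin letters, currency symbols, smart quotes) don't, and anything unmappable becomes `?` no matter what the flow does. A minimal Python sketch of the behavior:

```python
# Hebrew text survives a round-trip through ISO-8859-8...
hebrew = "שלום"
assert hebrew.encode("iso-8859-8").decode("iso-8859-8") == hebrew

# ...but characters outside the charset cannot be represented.
# With errors="replace" they silently become "?", which matches
# the ??? symptom described above.
print("café €".encode("iso-8859-8", errors="replace"))  # b'caf? ?'
```

If the characters are within ISO-8859-8 and still come out as ???, the lossy conversion is likely happening on the Oracle side (the client NLS settings or the column character set) rather than in the NiFi flow itself.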


Thanks in advance

Guy Smadga
ETL infrastructure team leader
Israel discount bank



