Re: End of stream?

2015-11-06 Thread Matt Burgess
Sounds very promising, thank you!! I'll share what I find out :)

Are there other group-related use cases? Maybe some non-incremental statistical 
measures?

Regards,
Matt


> On Nov 6, 2015, at 9:48 PM, Michael Moser  wrote:
> 
> Matt,
> 
> There is the MonitorActivity processor, which "Monitors the flow for
> activity and sends out an indicator when the flow has not had any data for
> some specified amount of time and again when the flow's activity is
> restored".  You could look at how MonitorActivity is coded to get ideas for
> how your ReservoirSampling processor can do what you need.
> 
> -- Mike
> 
> 
> On Fri, Nov 6, 2015 at 11:49 AM, Matthew Burgess 
> wrote:
> 
>> No that makes sense, thanks much!
>> 
>> So for my case, I'm thinking I'd want another attribute from GetFile called
>> "lastInStream" or something? It would be set once processing of the current
>> directory is complete (for the time being), and reset each time the
>> onTrigger is called.  At that point it's really more of a "lastInBatch", so
>> maybe instead I could use the batch size somehow as a hint to the
>> ReservoirSampling processor that the current reservoir is ready to send
>> along?  The use case is a kind of burst processing (or per-batch
>> filtering),
>> where FlowFiles are available in "groups", where I could sample from the
>> incoming group with equal probability to give a smaller output group.
>> 
>> 
>> From:  Joe Witt 
>> Reply-To:  
>> Date:  Friday, November 6, 2015 at 11:38 AM
>> To:  
>> Subject:  Re: End of stream?
>> 
>> Matt,
>> 
>> For processors in the middle of the flow the null check is important
>> for race conditions where it is told it can run but by the time it
>> does there are no flowfiles left.  The framework though in general
>> will avoid this because it is checking if there is work to do.  So, in
>> short you can't use that mechanism to know there are no items left to
>> process.
>> 
>> The only way to know that a given flowfile was the last in a bunch
>> would be for that fact to be an attribute on a given flow file.
>> 
>> There is really no concept of an end of stream so to speak from a
>> processor perspective.  Processors are either running or not running.
>> You can, as I mentioned before though, use attributes of flowfiles to
>> annotate their relative position in a stream.
>> 
>> Does that help explain it at all or did I make it more confusing?
>> 
>> Thanks
>> Joe
>> 
>> On Fri, Nov 6, 2015 at 11:32 AM, Matthew Burgess 
>> wrote:
>>> Does NiFi have the concept of an "end of stream" or is it designed to
>> pretty
>>> much always be running? For example if I use a GetFile processor
>> pointing at
>>> a single directory (with remove files = true), once all the files have
>> been
>>> processed, can downstream processors know that?
>>> 
>>> I'm working on a ReservoirSampling processor, and I have it successfully
>>> building the reservoir from all incoming FlowFiles. However it never
>> gets to
>>> the logic that sends the sampled FlowFiles to the downstream processor
>> (just
>>> a PutFile at this point). I have the logic in a block like:
>>> 
>>> FlowFile flowFile = session.get();
>>> if(flowFile == null) {
>>>   // send reservoir
>>> }
>>> else {
>>>  // build reservoir
>>> }
>>> 
>>> But the if-clause never gets entered.  Is there a different approach
>> and/or
>>> am I misunderstanding how the data flow works?
>>> 
>>> Thanks in advance,
>>> Matt
>> 
>> 
>> 
>> 


Re: End of stream?

2015-11-06 Thread Michael Moser
Matt,

There is the MonitorActivity processor, which "Monitors the flow for
activity and sends out an indicator when the flow has not had any data for
some specified amount of time and again when the flow's activity is
restored".  You could look at how MonitorActivity is coded to get ideas for
how your ReservoirSampling processor can do what you need.

-- Mike


On Fri, Nov 6, 2015 at 11:49 AM, Matthew Burgess 
wrote:

> No that makes sense, thanks much!
>
> So for my case, I'm thinking I'd want another attribute from GetFile called
> "lastInStream" or something? It would be set once processing of the current
> directory is complete (for the time being), and reset each time the
> onTrigger is called.  At that point it's really more of a "lastInBatch", so
> maybe instead I could use the batch size somehow as a hint to the
> ReservoirSampling processor that the current reservoir is ready to send
> along?  The use case is a kind of burst processing (or per-batch
> filtering),
> where FlowFiles are available in "groups", where I could sample from the
> incoming group with equal probability to give a smaller output group.
>
>
> From:  Joe Witt 
> Reply-To:  
> Date:  Friday, November 6, 2015 at 11:38 AM
> To:  
> Subject:  Re: End of stream?
>
> Matt,
>
> For processors in the middle of the flow the null check is important
> for race conditions where it is told it can run but by the time it
> does there are no flowfiles left.  The framework though in general
> will avoid this because it is checking if there is work to do.  So, in
> short you can't use that mechanism to know there are no items left to
> process.
>
> The only way to know that a given flowfile was the last in a bunch
> would be for that fact to be an attribute on a given flow file.
>
> There is really no concept of an end of stream so to speak from a
> processor perspective.  Processors are either running or not running.
> You can, as I mentioned before though, use attributes of flowfiles to
> annotate their relative position in a stream.
>
> Does that help explain it at all or did I make it more confusing?
>
> Thanks
> Joe
>
> On Fri, Nov 6, 2015 at 11:32 AM, Matthew Burgess 
> wrote:
> >  Does NiFi have the concept of an "end of stream" or is it designed to
> pretty
> >  much always be running? For example if I use a GetFile processor
> pointing at
> >  a single directory (with remove files = true), once all the files have
> been
> >  processed, can downstream processors know that?
> >
> >  I'm working on a ReservoirSampling processor, and I have it successfully
> >  building the reservoir from all incoming FlowFiles. However it never
> gets to
> >  the logic that sends the sampled FlowFiles to the downstream processor
> (just
> >  a PutFile at this point). I have the logic in a block like:
> >
> >  FlowFile flowFile = session.get();
> >  if(flowFile == null) {
> >    // send reservoir
> >  }
> >  else {
> >   // build reservoir
> >  }
> >
> >  But the if-clause never gets entered.  Is there a different approach
> and/or
> >  am I misunderstanding how the data flow works?
> >
> >  Thanks in advance,
> >  Matt
> >
> >
>
>
>
>


Re: End of stream?

2015-11-06 Thread Joe Witt
Also meant to reply back on this earlier...

It would be a reasonable JIRA to add logic to GetFile that sets an
attribute signaling that a given flow file was sourced from the
'last file left' in a given directory or source.  However, it is
somewhat odd: when is something considered the last?  Also of
note is that data could be reprioritized after GetFile, and then
you'd really not know if you're dealing with the last one, the first
one, or anything in between.  We'd really need GetFile to put a
timestamp and sequence id on each flow file, or something similar.  Hmmm.

Given what you're trying to do, could this logic of sampling
groups around some time interval instead simply be part of your processor?
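
One way to picture the time-interval idea is a small gate that the
processor consults each time it runs.  This is only a sketch under
assumptions: IntervalGate and shouldFlush are hypothetical names, not
NiFi API, and it assumes the processor is scheduled to run even when
no flow files are queued.

```java
import java.util.function.LongSupplier;

/** Decides when a sample group is complete based on elapsed time. */
final class IntervalGate {
    private final long intervalMillis;
    // The clock is injected so the sketch can be exercised without sleeping.
    private final LongSupplier clock;
    private long windowStart;

    IntervalGate(long intervalMillis, LongSupplier clock) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("intervalMillis must be positive");
        }
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Returns true once the interval has elapsed, then starts a new window. */
    boolean shouldFlush() {
        long now = clock.getAsLong();
        if (now - windowStart < intervalMillis) {
            return false;
        }
        windowStart = now;
        return true;
    }
}
```

In a real processor the clock would likely be System::currentTimeMillis,
and a true result would be the cue to transfer the current sample and
start a new group.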

Thanks
Joe

On Fri, Nov 6, 2015 at 9:48 PM, Michael Moser  wrote:
> Matt,
>
> There is the MonitorActivity processor, which "Monitors the flow for
> activity and sends out an indicator when the flow has not had any data for
> some specified amount of time and again when the flow's activity is
> restored".  You could look at how MonitorActivity is coded to get ideas for
> how your ReservoirSampling processor can do what you need.
>
> -- Mike
>
>
> On Fri, Nov 6, 2015 at 11:49 AM, Matthew Burgess 
> wrote:
>
>> No that makes sense, thanks much!
>>
>> So for my case, I'm thinking I'd want another attribute from GetFile called
>> "lastInStream" or something? It would be set once processing of the current
>> directory is complete (for the time being), and reset each time the
>> onTrigger is called.  At that point it's really more of a "lastInBatch", so
>> maybe instead I could use the batch size somehow as a hint to the
>> ReservoirSampling processor that the current reservoir is ready to send
>> along?  The use case is a kind of burst processing (or per-batch
>> filtering),
>> where FlowFiles are available in "groups", where I could sample from the
>> incoming group with equal probability to give a smaller output group.
>>
>>
>> From:  Joe Witt 
>> Reply-To:  
>> Date:  Friday, November 6, 2015 at 11:38 AM
>> To:  
>> Subject:  Re: End of stream?
>>
>> Matt,
>>
>> For processors in the middle of the flow the null check is important
>> for race conditions where it is told it can run but by the time it
>> does there are no flowfiles left.  The framework though in general
>> will avoid this because it is checking if there is work to do.  So, in
>> short you can't use that mechanism to know there are no items left to
>> process.
>>
>> The only way to know that a given flowfile was the last in a bunch
>> would be for that fact to be an attribute on a given flow file.
>>
>> There is really no concept of an end of stream so to speak from a
>> processor perspective.  Processors are either running or not running.
>> You can, as I mentioned before though, use attributes of flowfiles to
>> annotate their relative position in a stream.
>>
>> Does that help explain it at all or did I make it more confusing?
>>
>> Thanks
>> Joe
>>
>> On Fri, Nov 6, 2015 at 11:32 AM, Matthew Burgess 
>> wrote:
>> >  Does NiFi have the concept of an "end of stream" or is it designed to
>> pretty
>> >  much always be running? For example if I use a GetFile processor
>> pointing at
>> >  a single directory (with remove files = true), once all the files have
>> been
>> >  processed, can downstream processors know that?
>> >
>> >  I'm working on a ReservoirSampling processor, and I have it successfully
>> >  building the reservoir from all incoming FlowFiles. However it never
>> gets to
>> >  the logic that sends the sampled FlowFiles to the downstream processor
>> (just
>> >  a PutFile at this point). I have the logic in a block like:
>> >
>> >  FlowFile flowFile = session.get();
>> >  if(flowFile == null) {
>> >    // send reservoir
>> >  }
>> >  else {
>> >   // build reservoir
>> >  }
>> >
>> >  But the if-clause never gets entered.  Is there a different approach
>> and/or
>> >  am I misunderstanding how the data flow works?
>> >
>> >  Thanks in advance,
>> >  Matt
>> >
>> >
>>
>>
>>
>>


Push to 0.4.0 "planning"

2015-11-06 Thread Tony Kurc
All,
For those not on the commits mailing list, in a push for 0.4.0 you may have
seen a flurry of changing fix versions and comments about readiness for
0.4.0 to assignees. Joe wrote an email about a target date a bit ago, and
so Joe or I made some reasonable guesses about whether each ticket would be
ready for 0.4.0 or could wait. If you think a fix version was changed in
error, please re-add it.

What is still unresolved for 0.4.0? I used the following JQL, but you may
have a favorite Jira-fu that you prefer:

project = NIFI and fixVersion = 0.4.0 and resolution = Unresolved order by
updatedDate asc

Many of these have patches, so committers, reviews may be apropos!

Tomorrow I may start going through unresolved tickets with no fix version,
looking for important bugs that slipped through the cracks. I encourage
others to do so too.

project = NIFI and fixVersion is EMPTY and resolution = Unresolved order by
updatedDate desc


Tony


Re: Push to 0.4.0 "planning"

2015-11-06 Thread Joe Witt
Adding to what Tony said, I've just bumped the version in master to
0.4.0-SNAPSHOT as appropriate to the range of commits.  We have 22
tickets remaining, and fortunately nearly all have patches needing
review, so that is great.  If we can really focus on knocking these
existing ones down and put a concerted effort into testing a range of
cases, that should give us a good view into release readiness.  As soon
as the tickets for 0.4.0 are done I'll kick out a 0.4.0 RC1.

Thanks
Joe

On Fri, Nov 6, 2015 at 10:52 PM, Tony Kurc  wrote:
> All,
> For those not on the commits mailing list, in a push for 0.4.0 you may have
> seen a flurry of changing fix versions and comments about readiness for
> 0.4.0 to assignees. Joe wrote an email about a target date a bit ago, and
> so Joe or I made some reasonable guesses about whether the ticket would be
> ready for 0.4.0 or maybe could wait. Please, if you think this was done in
> error, please re-add it.
>
> What is still unresolved for 0.4.0? I used the following JQL, but you may
> have a favorite Jira-fu that you prefer:
>
> project = NIFI and fixVersion = 0.4.0 and resolution = Unresolved order by
> updatedDate asc
>
> Many of these have patches, so committers, reviews may be apropos!
>
> I may start going through unresolved tickets with no fix versions looking
> for important bugs that slipped through the cracks tomorrow. I encourage
> others to do so also.
>
> project = NIFI and fixVersion is EMPTY and resolution = Unresolved order by
> updatedDate desc
>
>
> Tony


Re: JSON / Avro issues

2015-11-06 Thread Tony Kurc
Presuming it is off a recent commit, you should be able to read a delimited
tab file using "\t" as the delimiter. There should be a dropdown that will
allow you to choose ARRAY or NONE as a JSON container option, which would
toggle between the two JSON representations you described.

On Thu, Nov 5, 2015 at 10:24 PM, Cerulean Blue  wrote:

> I'm using a snapshot built yesterday.
>
> Thanks
>
>
> Sent from my iPhone
>
> > On Nov 5, 2015, at 4:19 PM, Bryan Bende  wrote:
> >
> > Jeff,
> >
> > Are you using the 0.3.0 release?
> >
> > I think this is the issue you ran into which is resolved for the next
> release:
> > https://issues.apache.org/jira/browse/NIFI-944
> >
> > With regards to ConvertJSONtoAvro, I believe it expects one JSON document per
> line with a newline at the end of each line (your second example).
> >
> > -Bryan
> >
> >> On Thu, Nov 5, 2015 at 4:59 PM, Jeff  wrote:
> >> I built a simple flow that reads a tab separated file and attempts to
> convert to Avro.
> >>
> >> ConvertCSVtoAvro just says that the conversion failed.
> >>
> >> Where can I find more information on what the failure was?
> >>
> >> Using the same sample tab separated file, I create a JSON file out of
> it.
> >>
> >> The JSON to Avro processor also fails with very little explanation.
> >>
> >>
> >> With regard to the ConvertCSVtoAvro processor
> >> Since my file is tab delimited, do I simply open the "CSV
> delimiter" property, delete the ",", and hit the tab key, or is there a special
> syntax like ^t?
> >> My data has no CSV quote character, so do I leave this as ",
> delete it, or check the empty box?
> >>
> >> With regard to the ConvertJSONtoAvro
> >> What is the expected JSON source file to look like?
> >> [
> >>  {fields values … },
> >>  {fields values …}
> >> ]
> >> Or
> >>  {fields values … }
> >>  {fields values …}
> >> or something else.
> >>
> >> Thanks,
> >>
> >> Sorry for sending this to 2 lists
> >
>


[GitHub] nifi pull request: NIFI-1099 fixed the handling of InterruptedExce...

2015-11-06 Thread olegz
Github user olegz commented on the pull request:

https://github.com/apache/nifi/pull/115#issuecomment-154428329
  
I just want to make sure that we are all on the same page; _repeat the 
interrupt_ in the context of _Thread.interrupt()_ simply means communicating 
something that has already happened and is not the same as re-throwing the 
exception. Hence my point about that being safe. 

For cases where one really wants to ignore the interrupt, especially where 
```Thread.sleep(..)``` is used (a whole other topic), we can simply use 
```LockSupport.parkNanos(..)```.  An interrupt will not turn into an exception, 
making the caller responsible for periodically checking whether the current 
thread has been interrupted via ```Thread.currentThread().isInterrupted()```.
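
To make the two cases concrete, here is a minimal sketch, not code from the pull request. `InterruptSketch` and its method names are hypothetical; only `Thread.sleep`, `Thread.interrupt`, `Thread.isInterrupted`, and `LockSupport.parkNanos` are standard JDK calls.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

final class InterruptSketch {

    /** Sleeps and, if interrupted, restores the flag instead of swallowing it. */
    static void sleepAndRestore(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Throwing InterruptedException cleared the flag; set it again so
            // callers further up the stack can still observe the interrupt.
            Thread.currentThread().interrupt();
        }
    }

    /** Waits without any InterruptedException and reports the interrupt flag. */
    static boolean parkAndCheck(long millis) {
        // parkNanos returns early on interrupt (or spuriously) and leaves the
        // interrupt flag untouched, so the caller must check it explicitly.
        LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(millis));
        return Thread.currentThread().isInterrupted();
    }
}
```

Because `parkNanos` may also return spuriously, a caller that needs a minimum wait would have to loop on the remaining time; that is left out here for brevity.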





Re: Incorporation of other Maven repositories

2015-11-06 Thread Joe Percivall
As no issues were brought up, I'm going to assume that everyone is ok with 
adding Bintray JCenter as a repo. I plan on using it in a patch for 0.4.0 in 
which I'm refactoring InvokeHttp. The patch is dependent on a lib to add digest 
authentication that is only hosted there.

Thanks,
Joe
- - - - - - 
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com




On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess  
wrote:
Bintray JCenter (https://bintray.com/bintray/jcenter/) is also moderated and
claims to be "the repository with the biggest collection of Maven artifacts
in the world". I think Bintray itself proxies out to Maven Central, but it
appears that for JCenter you choose to sync your artifacts with Maven
Central: http://blog.bintray.com/tag/maven-central/

I imagine trust is still a per-organization or per-artifact issue, but
Bintray claims to be even safer and more trustworthy than Maven Central
(source: 
http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).  For
my (current) work and home projects, I still resolve from Maven Central, but
I have been publishing my own artifacts to Bintray.

Regards,
Matt

From:  Aldrin Piri 
Reply-To:  
Date:  Tuesday, November 3, 2015 at 12:34 PM
To:  
Subject:  Incorporation of other Maven repositories


I am writing to see what the general guidance and posture is on
incorporating additional repositories into the build process.

Obviously, Maven Central provides a very known quantity.  Are there other
repositories that are viewed with the same level of trust?  If so, is there
a listing? If not, do we vet new sources as they bring libraries that aid
our project and how is this accomplished?

Incorporating other repos brings up additional areas of concern,
specifically availability but also some additional security considerations
to the binaries that are being retrieved.

Any thoughts on this front would be much appreciated.


Re: Incorporation of other Maven repositories

2015-11-06 Thread Joe Witt
Joe

Sorry I didn't catch this thread sooner.  I am not supportive of
adding a required repo if it means we need to tell folks to update
their maven settings.  While it sounds trivial it really isn't.  We
should seek to understand better what other projects do for such
things.  Definitely no fast movement on this one please.

Thanks
Joe

On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
 wrote:
> As no issues were brought up, I'm going to assume that everyone is ok with 
> adding Bintray JCenter as a repo. I plan on using it in a patch for 0.4.0 in 
> which I'm refactoring InvokeHttp. The patch is dependent on a lib to add 
> digest authentication that is only hosted there.
>
> Thanks,
> Joe
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: joeperciv...@yahoo.com
>
>
>
>
> On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess  
> wrote:
> Bintray JCenter (https://bintray.com/bintray/jcenter/) is also moderated and
> claims to be "the repository with the biggest collection of Maven artifacts
> in the world". I think Bintray itself proxies out to Maven Central, but it
> appears that for JCenter you choose to sync your artifacts with Maven
> Central: http://blog.bintray.com/tag/maven-central/
>
> I imagine trust is still a per-organization or per-artifact issue, but
> Bintray claims to be even safer and more trustworthy than Maven Central
> (source:
> http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).  For
> my (current) work and home projects, I still resolve from Maven Central, but
> I have been publishing my own artifacts to Bintray.
>
> Regards,
> Matt
>
> From:  Aldrin Piri 
> Reply-To:  
> Date:  Tuesday, November 3, 2015 at 12:34 PM
> To:  
> Subject:  Incorporation of other Maven repositories
>
>
> I am writing to see what the general guidance and posture is on
> incorporating additional repositories into the build process.
>
> Obviously, Maven Central provides a very known quantity.  Are there other
> repositories that are viewed with the same level of trust?  If so, is there
> a listing? If not, do we vet new sources as they bring libraries that aid
> our project and how is this accomplished?
>
> Incorporating other repos brings up additional areas of concern,
> specifically availability but also some additional security considerations
> to the binaries that are being retrieved.
>
> Any thoughts on this front would be much appreciated.


Re: Incorporation of other Maven repositories

2015-11-06 Thread Joe Witt
Joe explained to me that he meant to update the nifi pom.xml with this
repository.  Today we use whatever the apache pom (which we extend
from) uses, which for releases is nothing, which means it is whatever
maven defaults to (presumably maven central).  We see that spark
does this explicit addition of repositories in their pom for both
primary artifacts and plugins.

My concern with this is that our requirement as a community is to
provide repeatable builds.  We looked into what Hbase and Spark do and
in fact both of them extend their poms to depend on other repos as
well so there is precedent.

In light of finding other apache projects that use extra repositories,
and the fact that JCenter Bintray, while being a commercially focused
repo, is offering free support for OSS artifacts, I think the risk
is low.  I am ok with this.

Anyone have a different view?

Thanks
Joe

On Fri, Nov 6, 2015 at 11:04 AM, Joe Witt  wrote:
> Joe
>
> Sorry i didn't catch this thread sooner.  I am not supportive of
> adding a required repo if it means we need to tell folks to update
> their maven settings.  While it sounds trivial it really isn't.  We
> should seek to understand better what other projects do for such
> things.  Definitely no fast movement on this one please.
>
> Thanks
> Joe
>
> On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
>  wrote:
>> As no issues were brought up, I'm going to assume that everyone is ok with 
>> adding Bintray JCenter as a repo. I plan on using it in a patch for 0.4.0 in 
>> which I'm refactoring InvokeHttp. The patch is dependent on a lib to add 
>> digest authentication that is only hosted there.
>>
>> Thanks,
>> Joe
>> - - - - - -
>> Joseph Percivall
>> linkedin.com/in/Percivall
>> e: joeperciv...@yahoo.com
>>
>>
>>
>>
>> On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess  
>> wrote:
>> Bintray JCenter (https://bintray.com/bintray/jcenter/) is also moderated and
>> claims to be "the repository with the biggest collection of Maven artifacts
>> in the world". I think Bintray itself proxies out to Maven Central, but it
>> appears that for JCenter you choose to sync your artifacts with Maven
>> Central: http://blog.bintray.com/tag/maven-central/
>>
>> I imagine trust is still a per-organization or per-artifact issue, but
>> Bintray claims to be even safer and more trustworthy than Maven Central
>> (source:
>> http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).  For
>> my (current) work and home projects, I still resolve from Maven Central, but
>> I have been publishing my own artifacts to Bintray.
>>
>> Regards,
>> Matt
>>
>> From:  Aldrin Piri 
>> Reply-To:  
>> Date:  Tuesday, November 3, 2015 at 12:34 PM
>> To:  
>> Subject:  Incorporation of other Maven repositories
>>
>>
>> I am writing to see what the general guidance and posture is on
>> incorporating additional repositories into the build process.
>>
>> Obviously, Maven Central provides a very known quantity.  Are there other
>> repositories that are viewed with the same level of trust?  If so, is there
>> a listing? If not, do we vet new sources as they bring libraries that aid
>> our project and how is this accomplished?
>>
>> Incorporating other repos brings up additional areas of concern,
>> specifically availability but also some additional security considerations
>> to the binaries that are being retrieved.
>>
>> Any thoughts on this front would be much appreciated.


End of stream?

2015-11-06 Thread Matthew Burgess
Does NiFi have the concept of an "end of stream" or is it designed to pretty
much always be running? For example if I use a GetFile processor pointing at
a single directory (with remove files = true), once all the files have been
processed, can downstream processors know that?

I'm working on a ReservoirSampling processor, and I have it successfully
building the reservoir from all incoming FlowFiles. However it never gets to
the logic that sends the sampled FlowFiles to the downstream processor (just
a PutFile at this point). I have the logic in a block like:

FlowFile flowFile = session.get();
if (flowFile == null) {
    // send reservoir
} else {
    // build reservoir
}

But the if-clause never gets entered.  Is there a different approach and/or
am I misunderstanding how the data flow works?

Thanks in advance,
Matt




Re: End of stream?

2015-11-06 Thread Matthew Burgess
No that makes sense, thanks much!

So for my case, I'm thinking I'd want another attribute from GetFile called
"lastInStream" or something? It would be set once processing of the current
directory is complete (for the time being), and reset each time the
onTrigger is called.  At that point it's really more of a "lastInBatch", so
maybe instead I could use the batch size somehow as a hint to the
ReservoirSampling processor that the current reservoir is ready to send
along?  The use case is a kind of burst processing (or per-batch filtering),
where FlowFiles are available in "groups", where I could sample from the
incoming group with equal probability to give a smaller output group.
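
As a rough sketch of the per-batch sampling described above, the
following uses Algorithm R to draw a uniform sample from each group
and emits it when an item flagged as last arrives.  Item,
BatchReservoir, and the lastInBatch flag are hypothetical stand-ins
for illustration; nothing here is NiFi API, and per this thread
GetFile would need a change before it could set such an attribute.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a FlowFile carrying a "lastInBatch" attribute. */
final class Item {
    final String name;
    final boolean lastInBatch;

    Item(String name, boolean lastInBatch) {
        this.name = name;
        this.lastInBatch = lastInBatch;
    }
}

/** Algorithm R: keeps a uniform random sample of at most 'capacity' items per batch. */
final class BatchReservoir {
    private final int capacity;
    private final Random random;
    private final List<Item> reservoir = new ArrayList<>();
    private int seen = 0;

    BatchReservoir(int capacity, Random random) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.random = random;
    }

    /**
     * Offers one item.  Returns the finished sample when the item is flagged
     * as the last of its batch; otherwise returns an empty list.
     */
    List<Item> offer(Item item) {
        seen++;
        if (reservoir.size() < capacity) {
            // The first 'capacity' items always enter the reservoir.
            reservoir.add(item);
        } else {
            // Later items replace a random slot with probability capacity / seen.
            int j = random.nextInt(seen);
            if (j < capacity) {
                reservoir.set(j, item);
            }
        }
        if (!item.lastInBatch) {
            return new ArrayList<>();
        }
        // End of batch: hand back the sample and reset for the next group.
        List<Item> sample = new ArrayList<>(reservoir);
        reservoir.clear();
        seen = 0;
        return sample;
    }
}
```

A batch smaller than the capacity simply comes back whole.  The same
class could be driven by a count or a timer instead of the flag if no
upstream processor can mark the last item.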


From:  Joe Witt 
Reply-To:  
Date:  Friday, November 6, 2015 at 11:38 AM
To:  
Subject:  Re: End of stream?

Matt,

For processors in the middle of the flow the null check is important
for race conditions where it is told it can run but by the time it
does there are no flowfiles left.  The framework though in general
will avoid this because it is checking if there is work to do.  So, in
short you can't use that mechanism to know there are no items left to
process.

The only way to know that a given flowfile was the last in a bunch
would be for that fact to be an attribute on a given flow file.

There is really no concept of an end of stream so to speak from a
processor perspective.  Processors are either running or not running.
You can, as I mentioned before though, use attributes of flowfiles to
annotate their relative position in a stream.

Does that help explain it at all or did I make it more confusing?

Thanks
Joe

On Fri, Nov 6, 2015 at 11:32 AM, Matthew Burgess 
wrote:
>  Does NiFi have the concept of an "end of stream" or is it designed to pretty
>  much always be running? For example if I use a GetFile processor pointing at
>  a single directory (with remove files = true), once all the files have been
>  processed, can downstream processors know that?
> 
>  I'm working on a ReservoirSampling processor, and I have it successfully
>  building the reservoir from all incoming FlowFiles. However it never gets to
>  the logic that sends the sampled FlowFiles to the downstream processor (just
>  a PutFile at this point). I have the logic in a block like:
> 
>  FlowFile flowFile = session.get();
>  if(flowFile == null) {
>    // send reservoir
>  }
>  else {
>   // build reservoir
>  }
> 
>  But the if-clause never gets entered.  Is there a different approach and/or
>  am I misunderstanding how the data flow works?
> 
>  Thanks in advance,
>  Matt
> 
> 





Re: Incorporation of other Maven repositories

2015-11-06 Thread Adam Taft
I'm concerned that not all networks will be able to connect with and use
the JCenter repository.  If it's not in Maven Central, we should likely
avoid the dependency and instead find alternative approaches.

Adam



On Fri, Nov 6, 2015 at 11:31 AM, Joe Witt  wrote:

> joe explained to me he meant to update the nifi pom.xml with this
> repository.  Today we use whatever the apache pom (which we extend
> from) uses, which for releases is nothing, which means it is whatever
> maven defaults to (presumably maven central).  So we see that spark
> does this explicit addition of repositories on their pom for both
> primary artifacts and plugins.
>
> My concern with this is that our requirement as a community is to
> provide repeatable builds.  We looked into what Hbase and Spark do and
> in fact both of them extend their poms to depend on other repos as
> well so there is precedent.
>
> In light of finding other apache projects that use extra repositories
> and the fact that Jcenter Bintray while being a commercially focused
> repo is offering free support for OSS artifacts then I think the risk
> is low.  I am ok with this.
>
> Anyone have a different view?
>
> Thanks
> Joe
>
> On Fri, Nov 6, 2015 at 11:04 AM, Joe Witt  wrote:
> > Joe
> >
> > Sorry i didn't catch this thread sooner.  I am not supportive of
> > adding a required repo if it means we need to tell folks to update
> > their maven settings.  While it sounds trivial it really isn't.  We
> > should seek to understand better what other projects do for such
> > things.  Definitely no fast movement on this one please.
> >
> > Thanks
> > Joe
> >
> > On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
> >  wrote:
> >> As no issues were brought up, I'm going to assume that everyone is ok
> with adding Bintray JCenter as a repo. I plan on using it in a patch for
> 0.4.0 in which I'm refactoring InvokeHttp. The patch is dependent on a lib
> to add digest authentication that is only hosted there.
> >>
> >> Thanks,
> >> Joe
> >> - - - - - -
> >> Joseph Percivall
> >> linkedin.com/in/Percivall
> >> e: joeperciv...@yahoo.com
> >>
> >>
> >>
> >>
> >> On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess <
> mattyb...@gmail.com> wrote:
> >> Bintray JCenter (https://bintray.com/bintray/jcenter/) is also
> moderated and
> >> claims to be "the repository with the biggest collection of Maven
> artifacts
> >> in the world". I think Bintray itself proxies out to Maven Central, but
> it
> >> appears that for JCenter you choose to sync your artifacts with Maven
> >> Central: http://blog.bintray.com/tag/maven-central/
> >>
> >> I imagine trust is still a per-organization or per-artifact issue, but
> >> Bintray claims to be even safer and more trustworthy than Maven Central
> >> (source:
> >> http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).
> For
> >> my (current) work and home projects, I still resolve from Maven
> Central, but
> >> I have been publishing my own artifacts to Bintray.
> >>
> >> Regards,
> >> Matt
> >>
> >> From:  Aldrin Piri 
> >> Reply-To:  
> >> Date:  Tuesday, November 3, 2015 at 12:34 PM
> >> To:  
> >> Subject:  Incorporation of other Maven repositories
> >>
> >>
> >> I am writing to see what the general guidance and posture is on
> >> incorporating additional repositories into the build process.
> >>
> >> Obviously, Maven Central provides a very known quantity.  Are there
> other
> >> repositories that are viewed with the same level of trust?  If so, is
> there
> >> a listing? If not, do we vet new sources as they bring libraries that
> aid
> >> our project and how is this accomplished?
> >>
> >> Incorporating other repos brings up additional areas of concern,
> >> specifically availability but also some additional security
> considerations
> >> to the binaries that are being retrieved.
> >>
> >> Any thoughts on this front would be much appreciated.
>


Re: Incorporation of other Maven repositories

2015-11-06 Thread Joe Witt
What are some examples of networks which can access Maven Central but
cannot access JCenter?

Thanks
Joe

On Fri, Nov 6, 2015 at 12:10 PM, Adam Taft  wrote:
> I'm concerned that not all networks will be able to connect with and use
> the JCenter repository.  If it's not in Maven Central, we should likely
> avoid the dependency and instead find alternative approaches.
>
> Adam
>
>
>
> On Fri, Nov 6, 2015 at 11:31 AM, Joe Witt  wrote:
>
>> joe explained to me he meant to update the nifi pom.xml with this
>> repository.  Today we use whatever the apache pom (which we extend
>> from uses) which for releases is nothing which means it is whatever
>> maven defaults to (presumably maven central).  So we see that spark
>> does this explicit addition of repositories on their pom for both
>> primary artifacts and plugins.
>>
>> My concern with this is that our requirement as a community is to
>> provide repeatable builds.  We looked into what Hbase and Spark do and
>> in fact both of them extend their poms to depend on other repos as
>> well so there is precedent.
>>
>> In light of finding other apache projects that use extra repositories
>> and the fact that Jcenter Bintray while being a commercially focused
>> repo is offering free support for OSS artifacts then I think the risk
>> is low.  I am ok with this.
>>
>> Anyone have a different view?
>>
>> Thanks
>> Joe
>>
>> On Fri, Nov 6, 2015 at 11:04 AM, Joe Witt  wrote:
>> > Joe
>> >
>> > Sorry i didn't catch this thread sooner.  I am not supportive of
>> > adding a required repo if it means we need to tell folks to update
>> > their maven settings.  While it sounds trivial it really isn't.  We
>> > should seek to understand better what other projects do for such
>> > things.  Definitely no fast movement on this one please.
>> >
>> > Thanks
>> > Joe
>> >
>> > On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
>> >  wrote:
>> >> As no issues were brought up, I'm going to assume that everyone is ok
>> with adding Bintray JCenter as a repo. I plan on using it in a patch for
>> 0.4.0 in which I'm refactoring InvokeHttp. The patch is dependent on a lib
>> to add digest authentication that is only hosted there.
>> >>
>> >> Thanks,
>> >> Joe
>> >> - - - - - -
>> >> Joseph Percivall
>> >> linkedin.com/in/Percivall
>> >> e: joeperciv...@yahoo.com
>> >>
>> >>
>> >>
>> >>
>> >> On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess <
>> mattyb...@gmail.com> wrote:
>> >> Bintray JCenter (https://bintray.com/bintray/jcenter/) is also
>> moderated and
>> >> claims to be "the repository with the biggest collection of Maven
>> artifacts
>> >> in the world". I think Bintray itself proxies out to Maven Central, but
>> it
>> >> appears that for JCenter you choose to sync your artifacts with Maven
>> >> Central: http://blog.bintray.com/tag/maven-central/
>> >>
>> >> I imagine trust is still a per-organization or per-artifact issue, but
>> >> Bintray claims to be even safer and more trustworthy than Maven Central
>> >> (source:
>> >> http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).
>> For
>> >> my (current) work and home projects, I still resolve from Maven
>> Central, but
>> >> I have been publishing my own artifacts to Bintray.
>> >>
>> >> Regards,
>> >> Matt
>> >>
>> >> From:  Aldrin Piri 
>> >> Reply-To:  
>> >> Date:  Tuesday, November 3, 2015 at 12:34 PM
>> >> To:  
>> >> Subject:  Incorporation of other Maven repositories
>> >>
>> >>
>> >> I am writing to see what the general guidance and posture is on
>> >> incorporating additional repositories into the build process.
>> >>
>> >> Obviously, Maven Central provides a very known quantity.  Are there
>> other
>> >> repositories that are viewed with the same level of trust?  If so, is
>> there
>> >> a listing? If not, do we vet new sources as they bring libraries that
>> aid
>> >> our project and how is this accomplished?
>> >>
>> >> Incorporating other repos brings up additional areas of concern,
>> >> specifically availability but also some additional security
>> considerations
>> >> to the binaries that are being retrieved.
>> >>
>> >> Any thoughts on this front would be much appreciated.
>>


Re: Incorporation of other Maven repositories

2015-11-06 Thread Joe Witt
As an additional data point, Hadoop does this as well.  So Hadoop,
Spark, and HBase, easily three of the most widely built open source
projects around, all do this.

Thanks
Joe

On Fri, Nov 6, 2015 at 1:01 PM, Joe Witt  wrote:
> What are some examples of networks which can access maven central but
> cannot access JCenter?
>
> Thanks
> Joe
>
> On Fri, Nov 6, 2015 at 12:10 PM, Adam Taft  wrote:
>> I'm concerned that not all networks will be able to connect with and use
>> the JCenter repository.  If it's not in Maven Central, we should likely
>> avoid the dependency and instead find alternative approaches.
>>
>> Adam
>>
>>
>>
>> On Fri, Nov 6, 2015 at 11:31 AM, Joe Witt  wrote:
>>
>>> joe explained to me he meant to update the nifi pom.xml with this
>>> repository.  Today we use whatever the apache pom (which we extend
>>> from uses) which for releases is nothing which means it is whatever
>>> maven defaults to (presumably maven central).  So we see that spark
>>> does this explicit addition of repositories on their pom for both
>>> primary artifacts and plugins.
>>>
>>> My concern with this is that our requirement as a community is to
>>> provide repeatable builds.  We looked into what Hbase and Spark do and
>>> in fact both of them extend their poms to depend on other repos as
>>> well so there is precedent.
>>>
>>> In light of finding other apache projects that use extra repositories
>>> and the fact that Jcenter Bintray while being a commercially focused
>>> repo is offering free support for OSS artifacts then I think the risk
>>> is low.  I am ok with this.
>>>
>>> Anyone have a different view?
>>>
>>> Thanks
>>> Joe
>>>
>>> On Fri, Nov 6, 2015 at 11:04 AM, Joe Witt  wrote:
>>> > Joe
>>> >
>>> > Sorry i didn't catch this thread sooner.  I am not supportive of
>>> > adding a required repo if it means we need to tell folks to update
>>> > their maven settings.  While it sounds trivial it really isn't.  We
>>> > should seek to understand better what other projects do for such
>>> > things.  Definitely no fast movement on this one please.
>>> >
>>> > Thanks
>>> > Joe
>>> >
>>> > On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
>>> >  wrote:
>>> >> As no issues were brought up, I'm going to assume that everyone is ok
>>> with adding Bintray JCenter as a repo. I plan on using it in a patch for
>>> 0.4.0 in which I'm refactoring InvokeHttp. The patch is dependent on a lib
>>> to add digest authentication that is only hosted there.
>>> >>
>>> >> Thanks,
>>> >> Joe
>>> >> - - - - - -
>>> >> Joseph Percivall
>>> >> linkedin.com/in/Percivall
>>> >> e: joeperciv...@yahoo.com
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess <
>>> mattyb...@gmail.com> wrote:
>>> >> Bintray JCenter (https://bintray.com/bintray/jcenter/) is also
>>> moderated and
>>> >> claims to be "the repository with the biggest collection of Maven
>>> artifacts
>>> >> in the world". I think Bintray itself proxies out to Maven Central, but
>>> it
>>> >> appears that for JCenter you choose to sync your artifacts with Maven
>>> >> Central: http://blog.bintray.com/tag/maven-central/
>>> >>
>>> >> I imagine trust is still a per-organization or per-artifact issue, but
>>> >> Bintray claims to be even safer and more trustworthy than Maven Central
>>> >> (source:
>>> >> http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).
>>> For
>>> >> my (current) work and home projects, I still resolve from Maven
>>> Central, but
>>> >> I have been publishing my own artifacts to Bintray.
>>> >>
>>> >> Regards,
>>> >> Matt
>>> >>
>>> >> From:  Aldrin Piri 
>>> >> Reply-To:  
>>> >> Date:  Tuesday, November 3, 2015 at 12:34 PM
>>> >> To:  
>>> >> Subject:  Incorporation of other Maven repositories
>>> >>
>>> >>
>>> >> I am writing to see what the general guidance and posture is on
>>> >> incorporating additional repositories into the build process.
>>> >>
>>> >> Obviously, Maven Central provides a very known quantity.  Are there
>>> other
>>> >> repositories that are viewed with the same level of trust?  If so, is
>>> there
>>> >> a listing? If not, do we vet new sources as they bring libraries that
>>> aid
>>> >> our project and how is this accomplished?
>>> >>
>>> >> Incorporating other repos brings up additional areas of concern,
>>> >> specifically availability but also some additional security
>>> considerations
>>> >> to the binaries that are being retrieved.
>>> >>
>>> >> Any thoughts on this front would be much appreciated.
>>>


Re: Incorporation of other Maven repositories

2015-11-06 Thread Tony Kurc
As we're providing source code, the repositories section in the pom is
more a "convenient pointer" than a "thou shalt use". Building using a
different repository of your choosing is as simple as adding a mirror in
your Maven settings.

Because of this, I'm not even close to having an objection.
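
For anyone who wants to see what such a mirror looks like, here is a minimal
sketch of a ~/.m2/settings.xml entry. It assumes the pom would declare the
extra repository under the id "jcenter"; that id, the mirror id, and the
repo.example.org URL are all hypothetical placeholders, not anything the
project has settled on.

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <mirrors>
    <mirror>
      <!-- Hypothetical id and name for a locally trusted repository manager -->
      <id>internal-mirror</id>
      <name>In-house mirror (placeholder)</name>
      <!-- Placeholder URL; point this at whatever repository you trust -->
      <url>https://repo.example.org/maven2</url>
      <!-- Must match the id of the repository declared in the pom;
           "jcenter" is an assumed id. Use * to redirect every repository. -->
      <mirrorOf>jcenter</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

With an entry like this, Maven sends requests meant for the matching
repository to the mirror URL instead, so the pom entry never has to be
reachable from the build machine.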

On Fri, Nov 6, 2015 at 1:03 PM, Joe Witt  wrote:

> As an additional data point Hadoop does this as well.  So Hadoop,
> Spark, and HBase easily three of the most widely built open source
> projects around do this.
>
> Thanks
> Joe
>
> On Fri, Nov 6, 2015 at 1:01 PM, Joe Witt  wrote:
> > What are some examples of networks which can access maven central but
> > cannot access JCenter?
> >
> > Thanks
> > Joe
> >
> > On Fri, Nov 6, 2015 at 12:10 PM, Adam Taft  wrote:
> >> I'm concerned that not all networks will be able to connect with and use
> >> the JCenter repository.  If it's not in Maven Central, we should likely
> >> avoid the dependency and instead find alternative approaches.
> >>
> >> Adam
> >>
> >>
> >>
> >> On Fri, Nov 6, 2015 at 11:31 AM, Joe Witt  wrote:
> >>
> >>> joe explained to me he meant to update the nifi pom.xml with this
> >>> repository.  Today we use whatever the apache pom (which we extend
> >>> from uses) which for releases is nothing which means it is whatever
> >>> maven defaults to (presumably maven central).  So we see that spark
> >>> does this explicit addition of repositories on their pom for both
> >>> primary artifacts and plugins.
> >>>
> >>> My concern with this is that our requirement as a community is to
> >>> provide repeatable builds.  We looked into what Hbase and Spark do and
> >>> in fact both of them extend their poms to depend on other repos as
> >>> well so there is precedent.
> >>>
> >>> In light of finding other apache projects that use extra repositories
> >>> and the fact that Jcenter Bintray while being a commercially focused
> >>> repo is offering free support for OSS artifacts then I think the risk
> >>> is low.  I am ok with this.
> >>>
> >>> Anyone have a different view?
> >>>
> >>> Thanks
> >>> Joe
> >>>
> >>> On Fri, Nov 6, 2015 at 11:04 AM, Joe Witt  wrote:
> >>> > Joe
> >>> >
> >>> > Sorry i didn't catch this thread sooner.  I am not supportive of
> >>> > adding a required repo if it means we need to tell folks to update
> >>> > their maven settings.  While it sounds trivial it really isn't.  We
> >>> > should seek to understand better what other projects do for such
> >>> > things.  Definitely no fast movement on this one please.
> >>> >
> >>> > Thanks
> >>> > Joe
> >>> >
> >>> > On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
> >>> >  wrote:
> >>> >> As no issues were brought up, I'm going to assume that everyone is
> ok
> >>> with adding Bintray JCenter as a repo. I plan on using it in a patch
> for
> >>> 0.4.0 in which I'm refactoring InvokeHttp. The patch is dependent on a
> lib
> >>> to add digest authentication that is only hosted there.
> >>> >>
> >>> >> Thanks,
> >>> >> Joe
> >>> >> - - - - - -
> >>> >> Joseph Percivall
> >>> >> linkedin.com/in/Percivall
> >>> >> e: joeperciv...@yahoo.com
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess <
> >>> mattyb...@gmail.com> wrote:
> >>> >> Bintray JCenter (https://bintray.com/bintray/jcenter/) is also
> >>> moderated and
> >>> >> claims to be "the repository with the biggest collection of Maven
> >>> artifacts
> >>> >> in the world". I think Bintray itself proxies out to Maven Central,
> but
> >>> it
> >>> >> appears that for JCenter you choose to sync your artifacts with
> Maven
> >>> >> Central: http://blog.bintray.com/tag/maven-central/
> >>> >>
> >>> >> I imagine trust is still a per-organization or per-artifact issue,
> but
> >>> >> Bintray claims to be even safer and more trustworthy than Maven
> Central
> >>> >> (source:
> >>> >>
> http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).
> >>> For
> >>> >> my (current) work and home projects, I still resolve from Maven
> >>> Central, but
> >>> >> I have been publishing my own artifacts to Bintray.
> >>> >>
> >>> >> Regards,
> >>> >> Matt
> >>> >>
> >>> >> From:  Aldrin Piri 
> >>> >> Reply-To:  
> >>> >> Date:  Tuesday, November 3, 2015 at 12:34 PM
> >>> >> To:  
> >>> >> Subject:  Incorporation of other Maven repositories
> >>> >>
> >>> >>
> >>> >> I am writing to see what the general guidance and posture is on
> >>> >> incorporating additional repositories into the build process.
> >>> >>
> >>> >> Obviously, Maven Central provides a very known quantity.  Are there
> >>> other
> >>> >> repositories that are viewed with the same level of trust?  If so,
> is
> >>> there
> >>> >> a listing? If not, do we vet new sources as they bring libraries
> that
> >>> aid
> >>> >> our project and how is this 

Re: Incorporation of other Maven repositories

2015-11-06 Thread Adam Taft
I'm OK with this if trkurc is OK with this.  He's far wiser than I on most
everything.  ;)



On Fri, Nov 6, 2015 at 1:11 PM, Tony Kurc  wrote:

> As we're providing source code, the repositories section in the pom are
> more a "convenient pointer" than a "thou shalt use". Building using a
> different repository of your choosing is as simple as adding a mirror in
> your maven settings.
>
> Because of this, I'm not even close to having an objection.
>
> On Fri, Nov 6, 2015 at 1:03 PM, Joe Witt  wrote:
>
> > As an additional data point Hadoop does this as well.  So Hadoop,
> > Spark, and HBase easily three of the most widely built open source
> > projects around do this.
> >
> > Thanks
> > Joe
> >
> > On Fri, Nov 6, 2015 at 1:01 PM, Joe Witt  wrote:
> > > What are some examples of networks which can access maven central but
> > > cannot access JCenter?
> > >
> > > Thanks
> > > Joe
> > >
> > > On Fri, Nov 6, 2015 at 12:10 PM, Adam Taft  wrote:
> > >> I'm concerned that not all networks will be able to connect with and
> use
> > >> the JCenter repository.  If it's not in Maven Central, we should
> likely
> > >> avoid the dependency and instead find alternative approaches.
> > >>
> > >> Adam
> > >>
> > >>
> > >>
> > >> On Fri, Nov 6, 2015 at 11:31 AM, Joe Witt  wrote:
> > >>
> > >>> joe explained to me he meant to update the nifi pom.xml with this
> > >>> repository.  Today we use whatever the apache pom (which we extend
> > >>> from uses) which for releases is nothing which means it is whatever
> > >>> maven defaults to (presumably maven central).  So we see that spark
> > >>> does this explicit addition of repositories on their pom for both
> > >>> primary artifacts and plugins.
> > >>>
> > >>> My concern with this is that our requirement as a community is to
> > >>> provide repeatable builds.  We looked into what Hbase and Spark do
> and
> > >>> in fact both of them extend their poms to depend on other repos as
> > >>> well so there is precedent.
> > >>>
> > >>> In light of finding other apache projects that use extra repositories
> > >>> and the fact that Jcenter Bintray while being a commercially focused
> > >>> repo is offering free support for OSS artifacts then I think the risk
> > >>> is low.  I am ok with this.
> > >>>
> > >>> Anyone have a different view?
> > >>>
> > >>> Thanks
> > >>> Joe
> > >>>
> > >>> On Fri, Nov 6, 2015 at 11:04 AM, Joe Witt 
> wrote:
> > >>> > Joe
> > >>> >
> > >>> > Sorry i didn't catch this thread sooner.  I am not supportive of
> > >>> > adding a required repo if it means we need to tell folks to update
> > >>> > their maven settings.  While it sounds trivial it really isn't.  We
> > >>> > should seek to understand better what other projects do for such
> > >>> > things.  Definitely no fast movement on this one please.
> > >>> >
> > >>> > Thanks
> > >>> > Joe
> > >>> >
> > >>> > On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
> > >>> >  wrote:
> > >>> >> As no issues were brought up, I'm going to assume that everyone is
> > ok
> > >>> with adding Bintray JCenter as a repo. I plan on using it in a patch
> > for
> > >>> 0.4.0 in which I'm refactoring InvokeHttp. The patch is dependent on
> a
> > lib
> > >>> to add digest authentication that is only hosted there.
> > >>> >>
> > >>> >> Thanks,
> > >>> >> Joe
> > >>> >> - - - - - -
> > >>> >> Joseph Percivall
> > >>> >> linkedin.com/in/Percivall
> > >>> >> e: joeperciv...@yahoo.com
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >> On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess <
> > >>> mattyb...@gmail.com> wrote:
> > >>> >> Bintray JCenter (https://bintray.com/bintray/jcenter/) is also
> > >>> moderated and
> > >>> >> claims to be "the repository with the biggest collection of Maven
> > >>> artifacts
> > >>> >> in the world". I think Bintray itself proxies out to Maven
> Central,
> > but
> > >>> it
> > >>> >> appears that for JCenter you choose to sync your artifacts with
> > Maven
> > >>> >> Central: http://blog.bintray.com/tag/maven-central/
> > >>> >>
> > >>> >> I imagine trust is still a per-organization or per-artifact issue,
> > but
> > >>> >> Bintray claims to be even safer and more trustworthy than Maven
> > Central
> > >>> >> (source:
> > >>> >>
> > http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/).
> > >>> For
> > >>> >> my (current) work and home projects, I still resolve from Maven
> > >>> Central, but
> > >>> >> I have been publishing my own artifacts to Bintray.
> > >>> >>
> > >>> >> Regards,
> > >>> >> Matt
> > >>> >>
> > >>> >> From:  Aldrin Piri 
> > >>> >> Reply-To:  
> > >>> >> Date:  Tuesday, November 3, 2015 at 12:34 PM
> > >>> >> To:  
> > >>> >> Subject:  Incorporation of other Maven repositories
> > >>> >>
> > >>> >>
> > >>> >> I am writing to see what the general 

Re: Incorporation of other Maven repositories

2015-11-06 Thread Jean-Baptiste Onofré

Hi guys,

sorry, I'm back on the project after some busy weeks ;)

I agree with Tony: for convenience, having multiple Maven repos in the 
pom.xml is not a big deal.
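
As a rough illustration only, an extra repository entry in a pom.xml could
look like the fragment below. The id is a hypothetical label, and the URL is
what I understand JCenter's resolver endpoint to be; neither is taken from an
actual NiFi patch.

```xml
<repositories>
  <repository>
    <!-- Hypothetical id; any settings.xml mirror would match on this value -->
    <id>jcenter</id>
    <name>Bintray JCenter</name>
    <!-- Assumed resolver endpoint; verify before relying on it -->
    <url>https://jcenter.bintray.com</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <!-- Release builds should not pull snapshots from an extra repo -->
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
```

Maven Central stays available by default through the super POM, so an entry
like this adds to the repositories consulted rather than replacing Central.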


Just my $0.01

Regards
JB

On 11/06/2015 07:11 PM, Tony Kurc wrote:

As we're providing source code, the repositories section in the pom are
more a "convenient pointer" than a "thou shalt use". Building using a
different repository of your choosing is as simple as adding a mirror in
your maven settings.

Because of this, I'm not even close to having an objection.

On Fri, Nov 6, 2015 at 1:03 PM, Joe Witt  wrote:


As an additional data point Hadoop does this as well.  So Hadoop,
Spark, and HBase easily three of the most widely built open source
projects around do this.

Thanks
Joe

On Fri, Nov 6, 2015 at 1:01 PM, Joe Witt  wrote:

What are some examples of networks which can access maven central but
cannot access JCenter?

Thanks
Joe

On Fri, Nov 6, 2015 at 12:10 PM, Adam Taft  wrote:

I'm concerned that not all networks will be able to connect with and use
the JCenter repository.  If it's not in Maven Central, we should likely
avoid the dependency and instead find alternative approaches.

Adam



On Fri, Nov 6, 2015 at 11:31 AM, Joe Witt  wrote:


joe explained to me he meant to update the nifi pom.xml with this
repository.  Today we use whatever the apache pom (which we extend
from uses) which for releases is nothing which means it is whatever
maven defaults to (presumably maven central).  So we see that spark
does this explicit addition of repositories on their pom for both
primary artifacts and plugins.

My concern with this is that our requirement as a community is to
provide repeatable builds.  We looked into what Hbase and Spark do and
in fact both of them extend their poms to depend on other repos as
well so there is precedent.

In light of finding other apache projects that use extra repositories
and the fact that Jcenter Bintray while being a commercially focused
repo is offering free support for OSS artifacts then I think the risk
is low.  I am ok with this.

Anyone have a different view?

Thanks
Joe

On Fri, Nov 6, 2015 at 11:04 AM, Joe Witt  wrote:

Joe

Sorry i didn't catch this thread sooner.  I am not supportive of
adding a required repo if it means we need to tell folks to update
their maven settings.  While it sounds trivial it really isn't.  We
should seek to understand better what other projects do for such
things.  Definitely no fast movement on this one please.

Thanks
Joe

On Fri, Nov 6, 2015 at 10:18 AM, Joe Percivall
 wrote:

As no issues were brought up, I'm going to assume that everyone is ok
with adding Bintray JCenter as a repo. I plan on using it in a patch for
0.4.0 in which I'm refactoring InvokeHttp. The patch is dependent on a lib
to add digest authentication that is only hosted there.


Thanks,
Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com




On Tuesday, November 3, 2015 4:52 PM, Matthew Burgess <mattyb...@gmail.com> wrote:

Bintray JCenter (https://bintray.com/bintray/jcenter/) is also moderated and
claims to be "the repository with the biggest collection of Maven artifacts
in the world". I think Bintray itself proxies out to Maven Central, but it
appears that for JCenter you choose to sync your artifacts with Maven
Central: http://blog.bintray.com/tag/maven-central/

I imagine trust is still a per-organization or per-artifact issue, but
Bintray claims to be even safer and more trustworthy than Maven Central
(source:
http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/). For
my (current) work and home projects, I still resolve from Maven Central, but
I have been publishing my own artifacts to Bintray.

Regards,
Matt

From:  Aldrin Piri 
Reply-To:  
Date:  Tuesday, November 3, 2015 at 12:34 PM
To:  
Subject:  Incorporation of other Maven repositories


I am writing to see what the general guidance and posture is on
incorporating additional repositories into the build process.

Obviously, Maven Central provides a very known quantity.  Are there other
repositories that are viewed with the same level of trust?  If so, is there
a listing? If not, do we vet new sources as they bring libraries that aid
our project and how is this accomplished?

Incorporating other repos brings up additional areas of concern,
specifically availability but also some additional security considerations
to the binaries that are being retrieved.

Any thoughts on this front would be much appreciated.








--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com