Proposal: standard record metadata attributes for data sources

2018-04-13 Thread Mike Thomsen
I'd like to propose that all Get/Fetch/Query processors that are not
deprecated (or likely to be deprecated) get a standard convention for
attributes that describe things like:

1. Source system.
2. Database/table/index/collection/etc.
3. The lookup criteria that were used (similar to the "query attribute" some
already have).

Using GetMongo as an example, it would add something like this:

source.url=mongodb://localhost:27017
source.database=testdb
source.collection=test_collection
source.query={ "username": "john.smith" }
source.criteria.username=john.smith  // GetMongo would parse the query and add this.
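
A minimal sketch of how a processor might assemble the attributes listed
above, assuming the lookup criteria have already been parsed into key/value
pairs. The class and method names (SourceAttributes, build) are hypothetical
and not part of NiFi; only the source.* attribute names come from the
proposal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SourceAttributes {

    /**
     * Builds the proposed source.* attributes.
     * Assumption: the caller has already parsed the query into simple criteria.
     */
    public static Map<String, String> build(String url, String database, String collection,
                                            String query, Map<String, String> criteria) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("source.url", url);
        attrs.put("source.database", database);
        attrs.put("source.collection", collection);
        attrs.put("source.query", query);
        // One attribute per criterion, e.g. source.criteria.username
        for (Map.Entry<String, String> e : criteria.entrySet()) {
            attrs.put("source.criteria." + e.getKey(), e.getValue());
        }
        return attrs;
    }

    public static void main(String[] args) {
        Map<String, String> criteria = new LinkedHashMap<>();
        criteria.put("username", "john.smith");
        Map<String, String> attrs = build("mongodb://localhost:27017", "testdb",
                "test_collection", "{ \"username\": \"john.smith\" }", criteria);
        attrs.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real processor, the resulting map could be handed to
ProcessSession.putAllAttributes(flowFile, attrs) before the FlowFile is
transferred.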

We have a use case where a team is coming from an extremely batch-oriented
view and really wants to know when "dataset X" was run. Our solution was to
extract that from the result set because the dataset name is one of the
fields in the JSON body.

I think this would help expand what you can do out of the box with
provenance tracking because it would provide a lot of useful information
that could be stored in Solr or ES and then queried against terminating
processors' DROP events to get a solid window into when jobs were run
historically.

Thoughts?


Re: [EXT] Suggestion: Apache NiFi component enhancement

2018-04-13 Thread Michael Moser
Hi Sivaprasanna,

After reading your first email, I thought it would be a lot of work for
little benefit, because I think you would have needed to touch many parts
of the framework.

After reading your clarification, the scope seems more limited. It's
definitely an interesting idea, and here are my thoughts:

   - When searching by component type, will I be able to find components by
   both class name and annotation name?
   - Will the flow.xml contain the class name or the annotation name or
   both?  If just one, might people look through the flow.xml for the other
   name and not find anything?
   - Would the auto-generated documentation contain both class name and
   annotation name?
   - Would this have any impact on the Registry? Perhaps not yet, but does
   it fit with the desire to include component extensions in the Registry?
   - Would we have to modify the Logger messages to include the annotation
   name instead of the class name in the logs?

Thank you for your engagement with NiFi and for thinking of ways to improve
it!

-- Mike



On Fri, Apr 13, 2018 at 11:50 AM, Sivaprasanna wrote:

> Busy week, eh?
>
> Does anybody have any concerns or suggestions? Any input is appreciated :)
>
> Cheers,
> Sivaprasanna
>
> On Thu, Apr 12, 2018 at 10:14 PM, Sivaprasanna wrote:
>
> > No, my suggestion was a simpler approach. It affects only the UI, as my
> > intention is just to override how the 'type' gets rendered in the UI. For
> > example, a processor's type is set to its canonical class name (
> > DtoFactory.java#createProcessorDto
> >  bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/dto/DtoFactory.java#L2783>),
> > but rather than getting the canonical class name directly, let's get the
> > type from some other method that checks whether the new annotation is
> > present. If it is present, set the value provided in the annotation as the
> > type; if it's not present, set the canonical class name just as it is now.
> > Again, my intention is to affect only the UI, so as to avoid unnecessary
> > complications that could pose backwards compatibility issues.
> >
> > -
> > Sivaprasanna
> >
> > On Thu, Apr 12, 2018 at 1:35 PM, Peter Wicks (pwicks) wrote:
> >
> >> I think this is a good idea. But based on your example I think you would
> >> want to provide a primary Type along with a list of Alias types.
> >> If NiFi starts and it can no longer find a processor by the Type name it
> >> had in the flow.xml, it can check the annotations/aliases to see if it's
> >> been renamed. This would allow for easy renames.
> >>
> >> Example 1: NiFi can no longer find AzureDocumentDBProcessor. Developer
> >> renamed it to CosmosDBProcessor. In this case we don't really want the
> >> type to still say "DocumentDB", that's just confusing. Also, we might not
> >> want the type named CosmosDBProcessor. So we make the Type be something
> >> nice, like "Azure Cosmos DB", then add Aliases for
> >> "AzureDocumentDBProcessor" and "CosmosDBProcessor".
> >>
> >> Next year when Microsoft renames it "CelestialDB" we can rename the
> >> processor and add another alias.
> >>
> >> Something like that?
> >>
> >> -----Original Message-----
> >> From: Sivaprasanna [mailto:sivaprasanna...@gmail.com]
> >> Sent: Wednesday, April 11, 2018 23:37
> >> To: dev@nifi.apache.org
> >> Subject: [EXT] Suggestion: Apache NiFi component enhancement
> >>
> >> All,
> >>
> >> Currently the "type" of a component is actually the component's canonical
> >> class name, which gets rendered in the UI as the class name with the
> >> component version. This is good. However, I think it would be better to
> >> have an annotation which a developer can use to override the component
> >> type.
> >>
> >> How is it used?
> >> I think an annotation can be sufficient. The framework checks whether the
> >> annotation is present. If it is present, it uses the name provided there;
> >> otherwise it uses the class name, as happens now.
> >>
> >> Why and where is it needed?
> >>
> >>    - In scenarios where we devise a new naming convention and want to
> >>    apply it to older components without breaking backward compatibility
> >>    - A developer created a component class with one name but, later down
> >>    the line, the developer or someone else wants to change it to something
> >>    else. The reason could again be a naming convention, or just that the
> >>    new name makes more sense
> >>    - A component has been built to work with third-party tech (like the
> >>    Azure, MongoDB, S3, Druid processors), but later versions of that tech
> >>    have been renamed by the original creators. (Something similar happened
> >>    to Azure's DocumentDB, which later got rebranded as Azure CosmosDB.) In
> >>    such cases, without deprecating or rebuilding a new processor, thi
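
A rough sketch of the alias lookup Peter Wicks describes in the quoted
thread above: each component is registered under its current class name, a
display type, and any former names, so that a type name stored in an old
flow.xml can still be resolved after a rename. AliasRegistry and all of its
methods are hypothetical; nothing here reflects how NiFi actually loads
components.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class AliasRegistry {

    // Every known name (current class name or former alias) -> current class name.
    private final Map<String, String> nameToCurrent = new HashMap<>();
    // Current class name -> human-friendly type shown to users.
    private final Map<String, String> currentToDisplay = new HashMap<>();

    public void register(String currentClassName, String displayType, String... aliases) {
        currentToDisplay.put(currentClassName, displayType);
        nameToCurrent.put(currentClassName, currentClassName);
        for (String alias : aliases) {
            nameToCurrent.put(alias, currentClassName);
        }
    }

    /** Resolves a stored type name to the current class name, following aliases. */
    public Optional<String> resolve(String storedTypeName) {
        return Optional.ofNullable(nameToCurrent.get(storedTypeName));
    }

    /** Returns the display type for a stored type name, if it can be resolved. */
    public Optional<String> displayType(String storedTypeName) {
        return resolve(storedTypeName).map(currentToDisplay::get);
    }

    public static void main(String[] args) {
        AliasRegistry registry = new AliasRegistry();
        // The processor was renamed; the old class name is kept as an alias.
        registry.register("CosmosDBProcessor", "Azure Cosmos DB", "AzureDocumentDBProcessor");

        System.out.println(registry.resolve("AzureDocumentDBProcessor").orElse("not found"));
        System.out.println(registry.displayType("AzureDocumentDBProcessor").orElse("not found"));
        System.out.println(registry.resolve("UnknownProcessor").orElse("not found"));
    }
}
```

A later rename would only need another register call that lists the previous
names as aliases.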

Re: [EXT] Suggestion: Apache NiFi component enhancement

2018-04-13 Thread Sivaprasanna
Busy week, eh?

Does anybody have any concerns or suggestions? Any input is appreciated :)

Cheers,
Sivaprasanna

On Thu, Apr 12, 2018 at 10:14 PM, Sivaprasanna wrote:

> No, my suggestion was a simpler approach. It affects only the UI, as my
> intention is just to override how the 'type' gets rendered in the UI. For
> example, a processor's type is set to its canonical class name
> (DtoFactory.java#createProcessorDto), but rather than getting the canonical
> class name directly, let's get the type from some other method that checks
> whether the new annotation is present. If it is present, set the value
> provided in the annotation as the type; if it's not present, set the
> canonical class name just as it is now. Again, my intention is to affect
> only the UI, so as to avoid unnecessary complications that could pose
> backwards compatibility issues.
>
> -
> Sivaprasanna
>
> On Thu, Apr 12, 2018 at 1:35 PM, Peter Wicks (pwicks) wrote:
>
>> I think this is a good idea. But based on your example I think you would
>> want to provide a primary Type along with a list of Alias types.
>> If NiFi starts and it can no longer find a processor by the Type name it
>> had in the flow.xml, it can check the annotations/aliases to see if it's
>> been renamed. This would allow for easy renames.
>>
>> Example 1: NiFi can no longer find AzureDocumentDBProcessor. Developer
>> renamed it to CosmosDBProcessor. In this case we don't really want the type
>> to still say "DocumentDB", that's just confusing. Also, we might not want
>> the type named CosmosDBProcessor. So we make the Type be something nice,
>> like "Azure Cosmos DB", then add Aliases for "AzureDocumentDBProcessor" and
>> "CosmosDBProcessor".
>>
>> Next year when Microsoft renames it "CelestialDB" we can rename the
>> processor and add another alias.
>>
>> Something like that?
>>
>> -----Original Message-----
>> From: Sivaprasanna [mailto:sivaprasanna...@gmail.com]
>> Sent: Wednesday, April 11, 2018 23:37
>> To: dev@nifi.apache.org
>> Subject: [EXT] Suggestion: Apache NiFi component enhancement
>>
>> All,
>>
>> Currently the "type" of a component is actually the component's canonical
>> class name, which gets rendered in the UI as the class name with the
>> component version. This is good. However, I think it would be better to
>> have an annotation which a developer can use to override the component
>> type.
>>
>> How is it used?
>> I think an annotation can be sufficient. The framework checks whether the
>> annotation is present. If it is present, it uses the name provided there;
>> otherwise it uses the class name, as happens now.
>>
>> Why and where is it needed?
>>
>>    - In scenarios where we devise a new naming convention and want to
>>    apply it to older components without breaking backward compatibility
>>    - A developer created a component class with one name but, later down
>>    the line, the developer or someone else wants to change it to something
>>    else. The reason could again be a naming convention, or just that the
>>    new name makes more sense
>>    - A component has been built to work with third-party tech (like the
>>    Azure, MongoDB, S3, Druid processors), but later versions of that tech
>>    have been renamed by the original creators. (Something similar happened
>>    to Azure's DocumentDB, which later got rebranded as Azure CosmosDB.) In
>>    such cases, without deprecating or rebuilding a new processor, this can
>>    be used.
>>
>> Before creating a JIRA, I wanted to get the community's thoughts. Feel
>> free to share your thoughts and concerns. If everything seems fine, I'll
>> start working on the implementation.
>>
>> -
>>
>> Sivaprasanna
>>
>
>
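
The fallback described in this thread (use the annotation's value as the
displayed type when the annotation is present, otherwise the canonical class
name) could look roughly like the sketch below. The @DisplayType annotation,
the resolveType method, and the two example classes are all hypothetical and
are not part of NiFi; the sketch only illustrates the lookup, not where it
would hook into DtoFactory.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ComponentTypeDemo {

    /** Hypothetical annotation that overrides the type shown in the UI. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface DisplayType {
        String value();
    }

    /** Example component whose class name no longer matches the product name. */
    @DisplayType("Azure Cosmos DB")
    public static class AzureDocumentDBProcessor { }

    /** Example component without the annotation. */
    public static class PlainProcessor { }

    /** Returns the annotation value if present, else the canonical class name. */
    public static String resolveType(Class<?> componentClass) {
        DisplayType override = componentClass.getAnnotation(DisplayType.class);
        return override != null ? override.value() : componentClass.getCanonicalName();
    }

    public static void main(String[] args) {
        System.out.println(resolveType(AzureDocumentDBProcessor.class));
        System.out.println(resolveType(PlainProcessor.class));
    }
}
```

Because the class name itself is untouched, anything that stores or looks up
the class name would keep working under this approach; only the rendered
label changes.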


java-grok activity

2018-04-13 Thread Otto Fowler
I have been in contact with the maintainer of java-grok about the status of
the project, and I am happy to say that there has been activity today, as
well as some steps to move it forward and pull some forks back in.

https://groups.google.com/forum/#!forum/java-grok has been created to
discuss the project.

I would recommend anyone interested join up and we can see about getting it
active.

ottO


Re: PutHDFS error

2018-04-13 Thread Sivaprasanna
Glad that you have solved it. If possible, can you share how you resolved
it? It might help others who face this issue in the future.

-
Sivaprasanna

On Fri, 13 Apr 2018 at 7:05 PM, hemamoger  wrote:

> thanks for your suggestions.
> I have got the answer.
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/
>


Re: PutHDFS error

2018-04-13 Thread hemamoger
thanks for your suggestions.
I have got the answer.


