Re: v0.* QueryDatabaseTable vs v1 GenerateTableFetch

2016-08-15 Thread Matt Burgess
Oops sorry, had replied before I saw this :)


> On Aug 15, 2016, at 11:15 PM, Peter Wicks (pwicks)  wrote:
> 
> Oh, disregard :). I misread GenerateTableFetch as being an actual data fetch 
> vs a query builder.
>  
> From: Peter Wicks (pwicks) 
> Sent: Monday, August 15, 2016 9:11 PM
> To: 'users@nifi.apache.org' 
> Subject: v0.* QueryDatabaseTable vs v1 GenerateTableFetch
>  
> What is the future of QueryDatabaseTable? Unless I'm misunderstanding how it 
> works, it looks like GenerateTableFetch can do everything QueryDatabaseTable 
> can do and then some.  Is there a plan to phase out QueryDatabaseTable?  Is 
> there a reason for a new processor instead of an update to 
> QueryDatabaseTable? It looked like the only user-facing change was result 
> paging, which could have had a default of 0 (no paging).  Just curious.
>  
> Regards,
>   Peter


Re: v0.* QueryDatabaseTable vs v1 GenerateTableFetch

2016-08-15 Thread Matt Burgess
Peter,

Another difference between the two (besides the paging) is that 
QueryDatabaseTable executes SQL while GenerateTableFetch generates SQL. With 
the paging capability (which, together with Remote Process Groups, enables 
distributed fetch a la Sqoop), you're likely correct that GTF will 
replace/deprecate QDB.

Adding paging to QDB would not have given true distributed fetch: multiple 
tasks of QDB could not retrieve the correct pages, since they each execute the 
same code in parallel. GTF is designed to run on the primary node only, but 
would be connected (likely via an RPG) to ExecuteSQL, which can run in 
parallel.

Regards,
Matt

> On Aug 15, 2016, at 11:10 PM, Peter Wicks (pwicks)  wrote:
> 
> What is the future of QueryDatabaseTable? Unless I'm misunderstanding how it 
> works, it looks like GenerateTableFetch can do everything QueryDatabaseTable 
> can do and then some.  Is there a plan to phase out QueryDatabaseTable?  Is 
> there a reason for a new processor instead of an update to 
> QueryDatabaseTable? It looked like the only user-facing change was result 
> paging, which could have had a default of 0 (no paging).  Just curious.
>  
> Regards,
>   Peter


RE: v0.* QueryDatabaseTable vs v1 GenerateTableFetch

2016-08-15 Thread Peter Wicks (pwicks)
Oh, disregard :). I misread GenerateTableFetch as being an actual data fetch vs 
a query builder.

From: Peter Wicks (pwicks)
Sent: Monday, August 15, 2016 9:11 PM
To: 'users@nifi.apache.org' 
Subject: v0.* QueryDatabaseTable vs v1 GenerateTableFetch

What is the future of QueryDatabaseTable? Unless I'm misunderstanding how it 
works, it looks like GenerateTableFetch can do everything QueryDatabaseTable 
can do and then some.  Is there a plan to phase out QueryDatabaseTable?  Is 
there a reason for a new processor instead of an update to QueryDatabaseTable? 
It looked like the only user-facing change was result paging, which could have 
had a default of 0 (no paging).  Just curious.

Regards,
  Peter


v0.* QueryDatabaseTable vs v1 GenerateTableFetch

2016-08-15 Thread Peter Wicks (pwicks)
What is the future of QueryDatabaseTable? Unless I'm misunderstanding how it 
works, it looks like GenerateTableFetch can do everything QueryDatabaseTable 
can do and then some.  Is there a plan to phase out QueryDatabaseTable?  Is 
there a reason for a new processor instead of an update to QueryDatabaseTable? 
It looked like the only user-facing change was result paging, which could have 
had a default of 0 (no paging).  Just curious.

Regards,
  Peter


Re: Exception on Processor ConvertJSONToSQL

2016-08-15 Thread Bryan Bende
Carlos/Peter,

Thanks for reporting this issue. It seems IS_AUTOINCREMENT is causing
problems in a couple of situations; I know there was another issue with
Hive, where the driver returns IS_AUTO_INCREMENT rather than IS_AUTOINCREMENT.

We should definitely address this issue... would either of you be
interested in contributing your fix as a patch or PR? The try/catch
approach seems reasonable to me, assuming there is no method on ResultSet
to check whether the column exists.

As far as being able to remove ConvertAvroToJSON if ExecuteSQL could
produce JSON... I think we initially went with Avro to preserve the most
information about the schema/types coming from the database, but I do
agree that many times the next step is to immediately convert it to some
other format. I think it would be ideal if there were a pluggable approach
to generating the output of these result sets, so that any processors
dealing with rows of data could offer the user an option to select the
output format from choices like Avro, JSON, or CSV. There likely needs to
be some more design discussion around it, and of course development time :)
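The pluggable idea Bryan sketches could look roughly like the following: a single row-writer interface with interchangeable format implementations. This is a hypothetical illustration only; the `RowWriter`/`CsvRowWriter` names are invented here and do not correspond to any actual NiFi API.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: one interface, many output formats (Avro, JSON, CSV...).
// None of these types exist in NiFi; they only illustrate the design idea.
interface RowWriter {
    String write(List<String[]> rows);
}

class CsvRowWriter implements RowWriter {
    @Override
    public String write(List<String[]> rows) {
        // Join each row's fields with commas, one row per line.
        return rows.stream()
                   .map(row -> String.join(",", row))
                   .collect(Collectors.joining("\n"));
    }
}

public class PluggableOutputDemo {
    public static void main(String[] args) {
        RowWriter writer = new CsvRowWriter(); // user-selected format
        System.out.println(writer.write(List.of(
            new String[]{"id", "name"},
            new String[]{"1", "alice"}
        )));
    }
}
```

A JSON or Avro implementation would plug into the same interface, which is essentially what a per-processor output-format property would dispatch on.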

Thanks,

Bryan

On Mon, Aug 15, 2016 at 11:16 AM, Peter Wicks (pwicks) 
wrote:

> Carlos,
>
> I ran into this same error when querying Teradata. It looks like a lot of
> databases don't include this.
>
> I submitted a bug a couple of weeks ago: https://issues.apache.org/
> jira/browse/NIFI-2356
>
> I did something similar to your suggestion locally in a modified version
> of the code.
>
> Regards,
>   Peter
>
> *From:* Carlos Manuel Fernandes (DSI) [mailto:carlos.antonio.
> fernan...@cgd.pt]
> *Sent:* Thursday, August 11, 2016 9:20 AM
> *To:* users@nifi.apache.org
> *Subject:* Exception on Processor ConvertJSONToSQL
>
> Hi All,
>
> I am running some tests to move data from Db2 to Netezza using NiFi. If I
> don't use custom processors, it's an indirect route:
>
> ExecuteSQL (on Db2) -> ConvertAvroToJSON -> ConvertJSONToSQL -> PutSQL
> (bulk on Netezza)
>
> and this way, I get an exception on ConvertJSONToSQL:
>
> org.netezza.error.NzSQLException: The column name IS_AUTOINCREMENT not
> found.
> at org.netezza.sql.NzResultSet.findColumn(NzResultSet.java:266)
> ~[nzjdbc.jar:na]
> at org.netezza.sql.NzResultSet.getString(NzResultSet.java:1407)
> ~[nzjdbc.jar:na]
> at
> org.apache.commons.dbcp.DelegatingResultSet.getString(DelegatingResultSet.java:263)
> ~[na:na]
> at org.apache.nifi.processors.standard.ConvertJSONToSQL$
> ColumnDescription.from(ConvertJSONToSQL.java:678)
> ~[nifi-standard-processors-0.7.0.jar:0.7.0]
>
> The Netezza JDBC driver doesn't implement the IS_AUTOINCREMENT metadata
> column (the same is true for the Oracle driver). Probably the reason is
> that Netezza and Oracle don't have auto-increment columns; they use
> sequences for this purpose.
>
> One possible solution is to put a try/catch (it isn't beautiful) around
> final String autoIncrementValue = resultSet.getString("IS_AUTOINCREMENT");
> (ConvertJSONToSQL.java:678)
> and in the catch, set autoIncrementValue = 'NO'.
>
> Besides this error, we could remove the ConvertAvroToJSON step in the
> flow if ExecuteSQL were changed to generate optional output: Avro or JSON.
>
> What do you think?
>
> Thanks,
>
> Carlos


RE: PutSQL ERROR bulletin

2016-08-15 Thread Peter Wicks (pwicks)
Sven,

I've run into this a couple of times.  In my case some records would insert 
and some would not.  To find my issue:

 - I routed all failures back to PutSQL
 - Reduced the batch size to about 10
 - Changed the prioritization on the failure relationship so that failures 
would hopefully move to the back

I then let it run until my failure count had stabilized. I stopped the 
processor and looked at the values. I then built and executed a SQL statement 
by hand in my SQL editor.

In my case we found two issues:
 - We had Unicode characters being inserted into a non-Unicode field (NVARCHAR 
into VARCHAR)
 - PutSQL requires timestamps to be timestamps, but a JSONToSQL processor 
upstream expects timestamps to be epochs, so it can convert them back to 
timestamps... This was one of our big issues.
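The epoch convention Peter describes can be shown in a few lines. This is a rough illustration of the round trip, not NiFi's actual internals; the literal value is just an example.

```java
import java.sql.Timestamp;

public class EpochTimestampDemo {
    public static void main(String[] args) {
        // In a JSONToSQL-style flow the timestamp travels through the JSON
        // as epoch milliseconds, not as an ISO-8601 string...
        long epochMillis = 1471305600000L;

        // ...and the SQL side binds it back as a java.sql.Timestamp.
        Timestamp ts = new Timestamp(epochMillis);

        // The round trip is lossless; a formatted date string in the same
        // field would fail the batch instead.
        System.out.println(ts.getTime() == epochMillis); // true
    }
}
```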

Regards,
  Peter

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Friday, August 12, 2016 5:11 PM
To: users@nifi.apache.org
Subject: Re: PutSQL ERROR bulletin

Hello Sven

LogAttribute will just show the attributes as they are understood by the flow 
file itself.  But the PutSQL processor may be doing something more specific 
with the data.  Can you share your configuration for PutSQL?

Thanks
Joe

On Fri, Aug 12, 2016 at 7:05 PM, Sven Davison  wrote:
> Actually… I'm fairly sure I found it. I sent stuff off to the "LogAttribute"
> processor and found the input is not escaped.
>
> http://prntscr.com/c51je3
>
> Sent from Mail for Windows 10
>
> From: Sven Davison
> Sent: Friday, August 12, 2016 7:02 PM
> To: users@nifi.apache.org
> Subject: PutSQL ERROR bulletin
>
> I'm getting several inserts, but every once in a while (every 1-2
> minutes or so) I get this error. Anyone know what might cause this?
>
> PutSQL[id=b8d54aa6-567a-4686-96bc-1c00e5d43461] Failed to update
> database due to a failed batch update. There were a total of 1
> FlowFiles that failed,
> 0 that succeeded, and 0 that were not execute and will be routed to
> retry;
>
> -Sven
>
> Sent from Mail for Windows 10


RE: Exception on Processor ConvertJSONToSQL

2016-08-15 Thread Peter Wicks (pwicks)
Carlos,

I ran into this same error when querying Teradata. It looks like a lot of 
databases don't include this.
I submitted a bug a couple of weeks ago: 
https://issues.apache.org/jira/browse/NIFI-2356

I did something similar to your suggestion locally in a modified version of the 
code.

Regards,
  Peter



From: Carlos Manuel Fernandes (DSI) [mailto:carlos.antonio.fernan...@cgd.pt]
Sent: Thursday, August 11, 2016 9:20 AM
To: users@nifi.apache.org
Subject: Exception on Processor ConvertJSONToSQL

Hi All,

I am running some tests to move data from Db2 to Netezza using NiFi. If I 
don't use custom processors, it's an indirect route:

ExecuteSQL (on Db2) -> ConvertAvroToJSON -> ConvertJSONToSQL -> PutSQL (bulk 
on Netezza)

and this way, I get an exception on ConvertJSONToSQL:
org.netezza.error.NzSQLException: The column name IS_AUTOINCREMENT not found.
at org.netezza.sql.NzResultSet.findColumn(NzResultSet.java:266) 
~[nzjdbc.jar:na]
at org.netezza.sql.NzResultSet.getString(NzResultSet.java:1407) 
~[nzjdbc.jar:na]
at 
org.apache.commons.dbcp.DelegatingResultSet.getString(DelegatingResultSet.java:263)
 ~[na:na]
at 
org.apache.nifi.processors.standard.ConvertJSONToSQL$ColumnDescription.from(ConvertJSONToSQL.java:678)
 ~[nifi-standard-processors-0.7.0.jar:0.7.0]

The Netezza JDBC driver doesn't implement the IS_AUTOINCREMENT metadata column 
(the same is true for the Oracle driver). Probably the reason is that Netezza 
and Oracle don't have auto-increment columns; they use sequences for this 
purpose.

One possible solution is to put a try/catch (it isn't beautiful) around
final String autoIncrementValue = resultSet.getString("IS_AUTOINCREMENT");  
(ConvertJSONToSQL.java:678)
and in the catch, set autoIncrementValue = 'NO'.
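The try/catch fallback suggested above can be sketched as follows. This is a minimal illustration of the pattern, not the actual NiFi patch; the `ColumnReader` interface is invented here so the fallback can be shown (and tested) without a live JDBC connection, where the real code would call `resultSet.getString("IS_AUTOINCREMENT")` directly.

```java
import java.sql.SQLException;

public class AutoIncrementFallback {

    // Stand-in for resultSet.getString("IS_AUTOINCREMENT"); invented here
    // so the pattern is demonstrable without a database.
    @FunctionalInterface
    public interface ColumnReader {
        String read() throws SQLException;
    }

    // Try to read IS_AUTOINCREMENT; drivers like Netezza's and Oracle's
    // throw because the column is missing, so fall back to "NO".
    public static String readAutoIncrement(ColumnReader reader) {
        try {
            return reader.read();
        } catch (SQLException e) {
            return "NO";
        }
    }

    public static void main(String[] args) {
        // A driver that supports the column:
        System.out.println(readAutoIncrement(() -> "YES"));
        // A driver that does not, like Netezza's NzSQLException case:
        System.out.println(readAutoIncrement(() -> {
            throw new SQLException("The column name IS_AUTOINCREMENT not found.");
        }));
    }
}
```

A cleaner alternative to the catch-all would be checking the `ResultSetMetaData` column labels before reading, but as noted above, the try/catch gets the job done.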


Besides this error, we could remove the ConvertAvroToJSON step in the flow if 
ExecuteSQL were changed to generate optional output: Avro or JSON.

What do you think?

Thanks

Carlos