+1, this is simplifying operator configuration, we should do same for other DB operators as well.
-Priyanka On Thu, Mar 10, 2016 at 5:34 PM, Bhupesh Chawda <[email protected]> wrote: > Hi All, > > Apart from extending the JDBC Output operator (for update, merge and > delete), I am planning the following change to the "*insert*" > functionality: > > > - It was pointed out that - if the incoming POJO has fields which are to > be inserted into a database table, the user should not have to specify > any > kind of mapping between POJO columns and database table column names. > This > is possible if the POJO field names are same as the db column names. > > > - Here is what I am planning to do to address this: > - Have a flag which indicates "*No mapping required. Assume same POJO > field names as the DB column names.*" > - If this flag is true, extract DB column names (in addition to the > types, which is done currently) and create getters from POJO based > on the > DB column types. > - Use these getters to populate the parametrized "*insert*" query. > > > Note - Once this is done, the user would not have to specify any field > descriptions. Just the DB table name would be sufficient. This would only > be possible if the DB column names are the same as the field names in the > POJO. > Without this change, "*insert*" will be considered as just another DDL > query like "update" / "delete" / "merge". We accept a parametrized query > and a mapping from the user for POJO field names to DB column names + DB > column types. > > Thanks. > Bhupesh > > > On Wed, Dec 30, 2015 at 12:56 PM, Bhupesh Chawda <[email protected]> > wrote: > > > Also, since we are not creating separate operators for insert and > > update/merge, it seems we don't need an abstract class. We can directly > > modify JdbcPOJOOutputOperator. > > > > -Bhupesh > > > > On Wed, Dec 30, 2015 at 11:17 AM, Bhupesh Chawda < > [email protected]> > > wrote: > > > >> Also, a note regarding accepting fieldInfo objects: > >> > >> FieldInfo has fields for java type of the column, but not the SQL data > >> type which is needed to set the parameters in the SQL statement. This > still > >> needs to be derived from the database (as in insert). > >> In more complex scenarios, as mentioned earlier, this may not always be > >> possible. In this case we have the following options: > >> > >> 1. Accept the SQL data types from the user for the parameters ("?") > >> in a new data structure. > >> 2. Accept the JSON structure as specified earlier. > >> 3. Modify FieldInfo to have an additional optional field which > >> indicates the SQL data type of the parameter ("?") > >> > >> I am planning to go ahead with option 3. > >> > >> Comments? > >> > >> -Bhupesh > >> > >> > >> > >> On Mon, Dec 28, 2015 at 11:26 PM, Chandni Singh < > [email protected]> > >> wrote: > >> > >>> Yeah that sounds good. > >>> > >>> Chandni > >>> > >>> On Mon, Dec 28, 2015 at 2:46 AM, Bhupesh Chawda < > [email protected] > >>> > > >>> wrote: > >>> > >>> > Thanks Chandni for the comments. > >>> > > >>> > Additionally I think, we should do away with constructing the SQL > >>> query in > >>> > the operator. This is because it is only possible in case of simple > >>> insert > >>> > statements. In case a where clause is needed in the insert statement, > >>> we > >>> > cannot construct the SQL easily. If we are accepting a parametrized > >>> query > >>> > from the user for update/merge, why not do the same for insert > >>> statements > >>> > as well? Then the hierarchy would look like: > >>> > - AbstractJdbcPOJOOutputOperator (As suggested) > >>> > -- JdbcPOJOOutputOperator (Takes care of all statements insert, > update, > >>> > merge, delete) > >>> > > >>> > We don't need a separate operator for inserts and update/merge. > >>> > > >>> > Thanks. > >>> > -Bhupesh > >>> > > >>> > > >>> > On Mon, Dec 28, 2015 at 3:32 PM, Chandni Singh < > >>> [email protected]> > >>> > wrote: > >>> > > >>> > > 1. FieldInfo, is ultimately a custom object and any user who uses > >>> this > >>> > > operator has to construct an object, populate it and then use > it. > >>> We > >>> > are > >>> > > trying to avoid using any custom object and allow any user to > use > >>> the > >>> > > operator without writing any extra code; just configuration. > >>> > > FieldInfo is a way to provide configuration in a UI friendly way. > >>> > Providing > >>> > > configuration as JSON is not UI friendly. > >>> > > > >>> > > 2. In case of update / merge, we need what SQL data types that > the > >>> > > expression provided by the user would evaluate to. In case of > >>> insert > >>> > we > >>> > > can > >>> > > go and fetch the data types of the columns directly from DB. > >>> However, > >>> > > the > >>> > > same is not possible for custom expressions; "avg(salary)" for > >>> > instance. > >>> > > Ok so here is where you can make a change. > >>> > > - JDBCPojoOutput can be renamed to AbstractJDBCPojoOutpuOperator. > >>> > > - Abstraction is to fetch the type of column. > >>> > > - Add a concrete JDBCPojoInsertOutput that derives the types of > >>> columns > >>> > > directly from DB. Please note that FieldInfo can also provide type > of > >>> > > derived column. > >>> > > > >>> > > > >>> > > You mentioned "We are trying to avoid using any custom object and > >>> allow > >>> > any > >>> > > user to use the > >>> > > operator without writing any extra code". > >>> > > This I think is specific to your use case. You can create an > >>> extension of > >>> > > the above which takes JSON blob and creates FieldInfos from it. > >>> > > > >>> > > Chandni > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > On Mon, Dec 28, 2015 at 1:48 AM, Bhupesh Chawda < > >>> [email protected] > >>> > > > >>> > > wrote: > >>> > > > >>> > > > Hi Chandni, > >>> > > > > >>> > > > Following are the issues: > >>> > > > > >>> > > > 1. FieldInfo, is ultimately a custom object and any user who > >>> uses > >>> > this > >>> > > > operator has to construct an object, populate it and then use > >>> it. We > >>> > > are > >>> > > > trying to avoid using any custom object and allow any user to > >>> use > >>> > the > >>> > > > operator without writing any extra code; just configuration. > >>> > > > 2. In case of update / merge, we need what SQL data types that > >>> the > >>> > > > expression provided by the user would evaluate to. In case of > >>> insert > >>> > > we > >>> > > > can > >>> > > > go and fetch the data types of the columns directly from DB. > >>> > However, > >>> > > > the > >>> > > > same is not possible for custom expressions; "avg(salary)" for > >>> > > instance. > >>> > > > > >>> > > > Thanks. > >>> > > > > >>> > > > -Bhupesh > >>> > > > > >>> > > > > >>> > > > On Mon, Dec 28, 2015 at 2:57 PM, Chandni Singh < > >>> > [email protected]> > >>> > > > wrote: > >>> > > > > >>> > > > > Hi Bhupesh, > >>> > > > > > >>> > > > > JDBCPojoOutputOperator was written for a demo and therefore it > >>> was > >>> > > marked > >>> > > > > Evolving which is why I had mentioned that you should feel free > >>> to > >>> > > modify > >>> > > > > it. > >>> > > > > > >>> > > > > I think an insert query can be as complex as any other query. > It > >>> uses > >>> > > > > FieldInfo because in the app builder it is easy for the user to > >>> > provide > >>> > > > > that instead of JSON String. > >>> > > > > > >>> > > > > Can you please provide specifics about what it is that you find > >>> > > difficult > >>> > > > > to change/implement for providing update/merge/delete support > in > >>> > > > > JDBCPojoOutputOperator? > >>> > > > > > >>> > > > > Chandni > >>> > > > > > >>> > > > > On Mon, Dec 28, 2015 at 1:09 AM, Bhupesh Chawda < > >>> > > [email protected] > >>> > > > > > >>> > > > > wrote: > >>> > > > > > >>> > > > > > Hi All, > >>> > > > > > > >>> > > > > > It has been pointed out that adding another class for > handling > >>> > > update / > >>> > > > > > merge queries would not be a good option. > >>> > > > > > > >>> > > > > > Here are the current implementation details: > >>> > > > > > > >>> > > > > > - We have an existing class: JdbcPOJOOutputOperator which > >>> > accepts > >>> > > a > >>> > > > > list > >>> > > > > > of FieldInfo objects. Each element of this list indicates > >>> the > >>> > > > details > >>> > > > > > (column name, pojo field expression and datatype) of > fields > >>> that > >>> > > > need > >>> > > > > > to be > >>> > > > > > inserted. Using this, the operator formulates the insert > >>> query > >>> > in > >>> > > > the > >>> > > > > > setup > >>> > > > > > method and identifies the sql datatypes of these columns > >>> from > >>> > the > >>> > > > > > database > >>> > > > > > using the table name. > >>> > > > > > > >>> > > > > > > >>> > > > > > - Now, coming to the update / merge feature, it is > >>> difficult to > >>> > > > > > formulate the update / merge query in the operator logic > >>> due to > >>> > > the > >>> > > > > > complex > >>> > > > > > structure of these statements. For this reason, we plan to > >>> take > >>> > a > >>> > > > > > parametrized SQL query from the user. This may look like: > >>> > *"update > >>> > > > > table > >>> > > > > > set x = ?, y = ? where z + w > ? and a == 1;"*. Such > >>> statements > >>> > > can > >>> > > > be > >>> > > > > > accepted from the user in addition to a json string which > >>> > > indicates > >>> > > > > the > >>> > > > > > details for the parameters: *column name, the pojo > >>> expression > >>> > and > >>> > > > the > >>> > > > > > sql data type* of the expression. Note that this > >>> information is > >>> > > > > similar > >>> > > > > > to the FieldInfo object, but is a string which can be > >>> configured > >>> > > > > easily > >>> > > > > > by > >>> > > > > > the user. Also note that the data type which is accepted > is > >>> the > >>> > > SQL > >>> > > > > data > >>> > > > > > type. Using this info we can populate the parametrized > >>> query and > >>> > > run > >>> > > > > it > >>> > > > > > based on the incoming POJO. > >>> > > > > > > >>> > > > > > The second approach is able to handle all kinds of queries > >>> (insert > >>> > / > >>> > > > > update > >>> > > > > > / merge / delete). However, since we already have the > >>> > > > > > JdbcPOJOOutputOperator, we would like to merge the new > >>> > functionality > >>> > > > into > >>> > > > > > the same class. > >>> > > > > > > >>> > > > > > Here we have the following options: > >>> > > > > > > >>> > > > > > 1. Change the existing class (JdbcPOJOOutputOperator) to > the > >>> > > second > >>> > > > > > approach which is more generic and also handles inserts. > >>> > > > > > 2. Add the update/ merge functionality to the existing > class > >>> > > without > >>> > > > > > changing the existing functionality. This will have two > >>> > different > >>> > > > ways > >>> > > > > > that > >>> > > > > > insert queries may be handled in the operator. > >>> > > > > > 3. Add another class which extends from > >>> JdbcPOJOOutputOperator > >>> > and > >>> > > > > have > >>> > > > > > the update/merge functionality there. (This is not > >>> recommended.) > >>> > > > > > 4. Any other approach. > >>> > > > > > > >>> > > > > > Please suggest. > >>> > > > > > > >>> > > > > > Thanks. > >>> > > > > > > >>> > > > > > -Bhupesh. > >>> > > > > > > >>> > > > > > > >>> > > > > > On Mon, Dec 14, 2015 at 11:04 AM, Chandni Singh < > >>> > > > [email protected] > >>> > > > > > > >>> > > > > > wrote: > >>> > > > > > > >>> > > > > > > No I don't think we are restricting Malhar to just abstract > >>> > > classes. > >>> > > > > > > Whenever they are couple of use cases that we see quite > >>> often, we > >>> > > add > >>> > > > > > > concrete implementations. > >>> > > > > > > > >>> > > > > > > For eg. FileLineInputOperator which is a concrete > >>> implementation > >>> > of > >>> > > > > > > AbstractFileInputOperator. FSSliceReader is an example as > >>> well. > >>> > > > > > > > >>> > > > > > > In AbstractJdbcOutputOperator case there hasn't been such > >>> common > >>> > > > > > > insert/update query. > >>> > > > > > > > >>> > > > > > > Also if you look at the example I provided, it is very > >>> simple to > >>> > > > > provide > >>> > > > > > a > >>> > > > > > > concrete implementation. > >>> > > > > > > > >>> > > > > > > If you would like to change JdbcPOJOOutputOperator to work > >>> for > >>> > > > > > > "UPDATE/MERGE" then please go ahead. > >>> > > > > > > > >>> > > > > > > Chandni > >>> > > > > > > > >>> > > > > > > On Sun, Dec 13, 2015 at 8:47 PM, Bhupesh Chawda < > >>> > > > > [email protected] > >>> > > > > > > > >>> > > > > > > wrote: > >>> > > > > > > > >>> > > > > > > > I see. So, just to understand more, do we plan to keep > >>> Malhar > >>> > > > > > restricted > >>> > > > > > > to > >>> > > > > > > > the base functionality (as in abstract classes)? And put > >>> the > >>> > > > > > > configuration > >>> > > > > > > > aspect / concrete implementations in apps that use these > >>> > > operators? > >>> > > > > > > > > >>> > > > > > > > Thanks. > >>> > > > > > > > Bhupesh > >>> > > > > > > > > >>> > > > > > > > On Sat, Dec 12, 2015 at 5:43 AM, Chandni Singh < > >>> > > > > > [email protected]> > >>> > > > > > > > wrote: > >>> > > > > > > > > >>> > > > > > > > > Hi, > >>> > > > > > > > > > >>> > > > > > > > > Here is an example of doing Upsert with JDBC: > >>> > > > > > > > > > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > >>> > https://github.com/chandnisingh/Malhar/blob/examples/apps/jdbc/src/main/java/com/datatorrent/jdbc/JdbcWriter.java > >>> > > > > > > > > > >>> > > > > > > > > Thanks, > >>> > > > > > > > > Chandni > >>> > > > > > > > > > >>> > > > > > > > > On Fri, Dec 11, 2015 at 11:19 AM, Chandni Singh < > >>> > > > > > > [email protected] > >>> > > > > > > > > > >>> > > > > > > > > wrote: > >>> > > > > > > > > > >>> > > > > > > > > > The operators are under Malhar/lib/db/jdbc. > >>> > > > > > > > > > > >>> > > > > > > > > > Here is one of them: > >>> > > > > > > > > > > >>> > > > > > > > > > > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > >>> > https://github.com/apache/incubator-apex-malhar/blob/devel-3/library/src/main/java/com/datatorrent/lib/db/jdbc/AbstractJdbcTransactionableOutputOperator.java > >>> > > > > > > > > > > >>> > > > > > > > > > They work with any kind PreparedStatement - insert or > >>> > update > >>> > > > > > > > > > > >>> > > > > > > > > > Chandni > >>> > > > > > > > > > > >>> > > > > > > > > > > >>> > > > > > > > > > > >>> > > > > > > > > > On Fri, Dec 11, 2015 at 10:59 AM, Bhupesh Chawda < > >>> > > > > > > > > [email protected]> > >>> > > > > > > > > > wrote: > >>> > > > > > > > > > > >>> > > > > > > > > >> Hi Chandni, > >>> > > > > > > > > >> > >>> > > > > > > > > >> I don't see an update query being handled in the > >>> operator. > >>> > > > Could > >>> > > > > > you > >>> > > > > > > > > >> please > >>> > > > > > > > > >> point me to the appropriate class? > >>> > > > > > > > > >> Or did you mean that handling a update query is > just a > >>> > > matter > >>> > > > of > >>> > > > > > > > > extending > >>> > > > > > > > > >> the class and providing a concrete implementation? > >>> > > > > > > > > >> > >>> > > > > > > > > >> Thanks. > >>> > > > > > > > > >> -Bhupesh > >>> > > > > > > > > >> > >>> > > > > > > > > >> On Fri, Dec 11, 2015 at 10:42 PM, Chandni Singh < > >>> > > > > > > > > [email protected]> > >>> > > > > > > > > >> wrote: > >>> > > > > > > > > >> > >>> > > > > > > > > >> > Hi Bhupesh, > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > The current abstract JDBC Output Operators in > >>> library > >>> > are > >>> > > > > > generic > >>> > > > > > > > and > >>> > > > > > > > > >> have > >>> > > > > > > > > >> > already been used in multiple POCs and > >>> applications. In > >>> > > fact > >>> > > > > > this > >>> > > > > > > > > >> operator > >>> > > > > > > > > >> > has matured through customer use cases. It is not > >>> just > >>> > an > >>> > > > > insert > >>> > > > > > > > > >> operator. > >>> > > > > > > > > >> > We have used it to perform update and inserts. > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > That said, I don't think it is a good idea to > >>> introduce > >>> > > > input > >>> > > > > > > format > >>> > > > > > > > > in > >>> > > > > > > > > >> > these abstract implementations. It is written to > >>> handle > >>> > > any > >>> > > > > > type > >>> > > > > > > of > >>> > > > > > > > > >> query, > >>> > > > > > > > > >> > be it a procedure call (that was an actual > customer > >>> use > >>> > > > case). > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > Chandni > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > On Fri, Dec 11, 2015 at 2:50 AM, Bhupesh Chawda < > >>> > > > > > > > > >> [email protected]> > >>> > > > > > > > > >> > wrote: > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > > Hi All, > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > We are planning to proceed with the following > >>> approach > >>> > > for > >>> > > > > > JDBC > >>> > > > > > > > > >> *update* > >>> > > > > > > > > >> > > operator: > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > - *Update Query Configuration* > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > - Example Update Query: *update tableName set > >>> a = > >>> > ?** > >>> > > > > > where b > >>> > > > > > > > = ? > >>> > > > > > > > > >> and > >>> > > > > > > > > >> > c > >>> > > > > > > > > >> > > > ?;* > >>> > > > > > > > > >> > > - Example JSON input array for parameter > >>> > > > instantiations: > >>> > > > > > > *[{a, > >>> > > > > > > > > >> > > expression, INTEGER}, {b, expression, > >>> VARCHAR}, {c, > >>> > > > > > > expression, > >>> > > > > > > > > >> > DATE}]* > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > We are also planning to change the JDBC Output > >>> > Operator > >>> > > in > >>> > > > > > > Malhar > >>> > > > > > > > > >> Library > >>> > > > > > > > > >> > > which currently does just insert. We plan to > make > >>> the > >>> > > > input > >>> > > > > > > format > >>> > > > > > > > > >> > > consistent for both insert and update and hence > >>> the > >>> > > change > >>> > > > > to > >>> > > > > > > the > >>> > > > > > > > > >> current > >>> > > > > > > > > >> > > way of configuration using JSON. Following would > >>> be > >>> > the > >>> > > > > config > >>> > > > > > > for > >>> > > > > > > > > >> > inserts: > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > - *Insert Query Configuration* > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > - Example Insert Query: *insert into > tableName > >>> > values > >>> > > > (?, > >>> > > > > > ?, > >>> > > > > > > > .. , > >>> > > > > > > > > >> ?);* > >>> > > > > > > > > >> > > - Example JSON input array for parameter > >>> > > > instantiations: > >>> > > > > > > *[{a, > >>> > > > > > > > > >> > > expression, INTEGER}, {b, expression, > >>> VARCHAR}, .. > >>> > , > >>> > > > {c, > >>> > > > > > > > > >> expression, > >>> > > > > > > > > >> > > DATE}]* > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > Please let us know your thoughts. > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > Thanks. > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > -Bhupesh > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > On Wed, Dec 9, 2015 at 6:38 PM, Bhupesh Chawda < > >>> > > > > > > > > >> [email protected]> > >>> > > > > > > > > >> > > wrote: > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > > Hi All, > >>> > > > > > > > > >> > > > > >>> > > > > > > > > >> > > > Would it be a good idea to introduce the > update > >>> > > > > > functionality > >>> > > > > > > to > >>> > > > > > > > > the > >>> > > > > > > > > >> > JDBC > >>> > > > > > > > > >> > > > output operator in Apache Apex Malhar library. > >>> > > > > > > > > >> > > > > >>> > > > > > > > > >> > > > The following are possible approaches: > >>> > > > > > > > > >> > > > > >>> > > > > > > > > >> > > > 1. Accept a update query from the user with > >>> place > >>> > > > > holders > >>> > > > > > > for > >>> > > > > > > > > >> > values. > >>> > > > > > > > > >> > > > Example: *update tableName set a = ?, b = ? > >>> > where c > >>> > > > = ? > >>> > > > > > and > >>> > > > > > > > d > > >>> > > > > > > > > >> ?*. > >>> > > > > > > > > >> > > > Here "?" will be provided by the user as > java > >>> > > > > expressions > >>> > > > > > > > which > >>> > > > > > > > > >> will > >>> > > > > > > > > >> > > be > >>> > > > > > > > > >> > > > evaluated from the incoming tuple. > >>> > > > > > > > > >> > > > 2. Another option is to accept in some > >>> > > configuration > >>> > > > > > format > >>> > > > > > > > > >> (json / > >>> > > > > > > > > >> > > > xml) the following and formulate the query > >>> in the > >>> > > > > > operator. > >>> > > > > > > > > This > >>> > > > > > > > > >> can > >>> > > > > > > > > >> > > become > >>> > > > > > > > > >> > > > arbitrarily complex. > >>> > > > > > > > > >> > > > 1. update clause columns > >>> > > > > > > > > >> > > > 2. update clause expressions > >>> > > > > > > > > >> > > > 3. where clause columns > >>> > > > > > > > > >> > > > 4. where clause expressions > >>> > > > > > > > > >> > > > > >>> > > > > > > > > >> > > > I am thinking about going ahead with 1. Please > >>> let > >>> > me > >>> > > > know > >>> > > > > > if > >>> > > > > > > > any > >>> > > > > > > > > >> other > >>> > > > > > > > > >> > > > option is possible and whether such a > >>> functionality > >>> > > > > already > >>> > > > > > > > exists > >>> > > > > > > > > >> in > >>> > > > > > > > > >> > > some > >>> > > > > > > > > >> > > > other class. > >>> > > > > > > > > >> > > > > >>> > > > > > > > > >> > > > Thanks. > >>> > > > > > > > > >> > > > > >>> > > > > > > > > >> > > > -Bhupesh > >>> > > > > > > > > >> > > > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > >>> > > > > > > > > > > >>> > > > > > > > > > > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > >>> > >> > >> > > >
