Re: [DISCUSS] get back update status after invoking executeUpdate(...)

Rohit Mon, 12 Aug 2013 06:16:32 -0700

Thanks for this.
I've 2 queries :

1. Instead of having 3 methods why can't we define an Enum (Delete, Update, 
Insert) and use only one method
which accepts this enum and returns the number of affected rows corresponding 
to that operation ?


2. How are we counting the affected rows ? Do we have collection of affected 
objects to do so (for update, delete and insert) ?

Yes we can discuss the async operation(s) in another Jira issue. Thanks again.

Cheers,
-RS


On 12-Aug-2013, at 1:09 PM, Kasper Sørensen <[email protected]> 
wrote:

> Let's watch out not to meddle the two features we're talking about.
> 
> I have been bringing this story forward because I have seen usages where we
> would like to get an indication of the number of affected rows by some
> update script. This would still be in a traditional blocking mode. And it
> is in line with proposing the three methods (or four, if a total is also to
> be provided) mentioned above.
> 
> The thing you're proposing is something else in my opinion. I would dub
> that 'asynchronuous/non-blocking/reactive update execution' or something
> like that, and I would prefer offering such a feature in a different method
> since .executeUpdate already has a semantic to it which is blocking.
> Offering a non-blocking variant of the method impacts everything about the
> method signature - the return type (a future-like object), the arguments (a
> listener, both for success and failure) and the exceptions thrown (none,
> since the execution is happening asynchronuously). I am all open for taking
> on this kind of update method as well, but maybe we should take that in a
> separate email thread / JIRA issue (if we agree that it is a entirely
> different method)?
> 
> Kasper
> 
> 
> 2013/8/12 rohit sharma <[email protected]>
> 
>> Agreed ..for returning the list of affected rows. I think then we need not
>> require following methods:
>> public int getInsertedRows();
>> public int getUpdatedRows();
>> public int getDeletedRows();
>> 
>> As per below code an UpdateStatus object is communicating the status of
>> UpdateScript's run method, not individual insert, delete or update
>> operations.
>> 
>> UpdateStatus status = dataContext.executeUpdate(new UpdateScript() {
>>>  public void run(UpdateCallback cb) {
>>>    cb.insertInto(table).value(...).execute();
>>>    cb.insertInto(table).value(...).execute();
>>>    cb.update(table).where(...).value(...).execute();
>>>    cb.deleteFrom(table).where(...).execute();
>>>  }
>>> });
>> 
>> I think we need following methods :
>> 1. Method to state if UpdateScript executed. However, this will not
>> indicate any failure or success.
>> 2. Method(s) to indicate is execution was successful or failure.
>> 3. Method to communicate custom data/info/any results from run method.
>> I think this is becoming like Future.
>> 
>> 
>> On Sun, Aug 11, 2013 at 3:30 PM, Kasper Sørensen <
>> [email protected]> wrote:
>> 
>>> The main development branch is 'master'. The .executeUpdate(...) method
>> is
>>> in the subinterface UpdateableDataContext.
>>> 
>>> My point is that you can provide an update script with many operations in
>>> one go. Take this example:
>>> 
>>> UpdateStatus status = dataContext.executeUpdate(new UpdateScript() {
>>>>  public void run(UpdateCallback cb) {
>>>>    cb.insertInto(table).value(...).execute();
>>>>    cb.insertInto(table).value(...).execute();
>>>>    cb.update(table).where(...).value(...).execute();
>>>>    cb.deleteFrom(table).where(...).execute();
>>>>  }
>>>> });
>>> 
>>> 
>>> In this example we know that 2 records have been inserted. Some amount of
>>> records have been updated (that may or may not be reported by the backing
>>> datastore) and some amount of records have been deleted (that may or may
>>> not have been reported by the backing datastore).
>>> 
>>> Getting a list of affected records would be very complicated IMO, and
>> would
>>> only be consistent if done at the time of the transaction. That means
>> that
>>> if we even figure out how to do it, we also have to do it eagerly, which
>> is
>>> obviously not a good thing to do since I think getting that list is a
>>> feature that would not apply to most use cases.
>>> 
>>> The idea about doing this in a multithreaded/nonblocking fashion is
>>> interesting. It would definately have to be an option in my opinion,
>> since
>>> I think the interface right now indicates that the executeUpdate method
>>> will return succesfully when the update is done (ie. it throws those
>>> exceptions that may arise during the update, and an unblocking variant
>>> would not be able to do that).
>>> 
>>> 
>>> 2013/8/11 Rohit <[email protected]>
>>> 
>>>> 
>>>> My point is why do we need 3 methods to provide the status ? We can use
>>>> one method to provide the status (Returns a boolean if update
>> operation a
>>>> done or not) and other to provide a list of affected rows/objects keys,
>>>> this would be less memory intensive.
>>>> And in case of deletion we can throw a custom exception stating " since
>>>> this is a delete operation we cannot provide affected rows".
>>>>>>>>> public interface UpdateStatus {
>>>>>>>>>  public <List or Collection> getAffectedRowsKeys(); or use
>>>> getRowsKeys(); or getKeys() throws MetaModelException;
>>>>                   public Boolean isDone();
>>>>>>>>> }
>>>> 
>>>> 
>>>> I see following remote branches in git repo:
>>>>  remotes/origin/HEAD -> origin/master
>>>>  remotes/origin/hbase-module
>>>>  remotes/origin/master
>>>> 
>>>> Which is the current development branch as I couldn't find
>> executeUpdate
>>>> in DataContext ?
>>>> 
>>>> On 11-Aug-2013, at 1:04 PM, Kasper Sørensen <
>>>> [email protected]> wrote:
>>>> 
>>>>> I agree on the reasoning here, but getting a list of affected rows
>> may
>>>>> prove to be both quite difficult (from consistency point of view it
>>>>> needs to be created immediately after some update - how can that be
>>>>> implemented in an effecient manner (lazy, since you wouldn't always
>>>>> want to get this list, but then consistency is gone), and how to
>>>>> guarantee that those rows are even available (e.g. after a DELETE
>>>>> statement in a JDBC database, you cannot get the rows anymore)), and
>>>>> too memory intensive (updates can be anything from a single record to
>>>>> millions of records).
>>>>> 
>>>>> That's why I think the three methods (maybe with an additional fourth
>>>>> method, returning the total num. of affected rows) is the best
>> initial
>>>>> set of methods.
>>>>> 
>>>>> 2013/8/11 Rohit <[email protected]>:
>>>>>> Having an abstraction for communicating status would be a cleaner
>>>> approach. As it allows us to extend the functionality if required in
>>> future.
>>>>>> For Example:
>>>>>> Lets say we are executing executeUpdate in multithreaded env. then
>> we
>>>> can provide methods to check the status of the update. (running or
>> done).
>>>>>> 
>>>>>> Also rather than having methods returning int I think we should
>> return
>>>> a collection of Affected Rows.
>>>>>> And rather than having multiple methods for insert/delete/update
>> just
>>>> have a single method returning the affected rows.
>>>>>> 
>>>>>> I think making so will make the API simpler yet functional.
>>>>>> 
>>>>>>>>> public interface UpdateStatus {
>>>>>>>>> public <List or Collection> getAffectedRows(); or just use
>>> getRows()
>>>>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> -RS
>>>>>> 
>>>>>> On 10-Aug-2013, at 10:09 PM, Kasper Sørensen <
>>>> [email protected]> wrote:
>>>>>> 
>>>>>>> My experience with returning something like an
>>>>>>> int/String/other-simple-type is that it seems to fit the purpose at
>>>>>>> first, but then someone asks "couldn't we also get the status of
>>>>>>> [something], too?" ... and then you regret not encapsulating the
>>>>>>> return value so that you could just add another getter on it.
>>>>>>> 
>>>>>>> 2013/8/10 Henry Saputra <[email protected]>:
>>>>>>>> We could just make the executeUpdate just return number of
>> elements
>>>>>>>> affected similar to relational databases.
>>>>>>>> 
>>>>>>>> It would be simpler and serve similar purpose.
>>>>>>>> 
>>>>>>>> - Henry
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Aug 9, 2013 at 12:15 AM, Kasper Sørensen <
>>>>>>>> [email protected]> wrote:
>>>>>>>> 
>>>>>>>>> Allow me to also bump this issue.
>>>>>>>>> 
>>>>>>>>> Since this would be a change that would not be compile-time
>>>>>>>>> incompatible with previous versions, I suggest we introduce at
>>> least
>>>>>>>>> the return type of the method now. Since everyone moving the
>> Apache
>>>>>>>>> MetaModel from the old metamodel would anyways have to recompile,
>>>> they
>>>>>>>>> would not feel the runtime incompatibility of void vs.
>>>> SomeReturnType.
>>>>>>>>> 
>>>>>>>>> By now I suggest we just keep a very simple returned interface
>>> which
>>>>>>>>> we can always choose to expand (implemented internally, not by
>>>>>>>>> consumers). For instance something like:
>>>>>>>>> 
>>>>>>>>> public interface UpdateStatus {
>>>>>>>>> public int getInsertedRows();
>>>>>>>>> public int getUpdatedRows();
>>>>>>>>> public int getDeletedRows();
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> This should be quite easy to implement once and reuse for most of
>>> the
>>>>>>>>> modules. For some datastores it might not be information that we
>>> can
>>>>>>>>> retrieve, so the interface should also specify an extraordinary
>>> rule;
>>>>>>>>> e.g. "if -1 is returned it means that it is not known how many
>> rows
>>>>>>>>> where inserted/updated/deleted"
>>>>>>>>> 
>>>>>>>>> Kasper
>>>>>>>>> 
>>>>>>>>> 2013/7/24 Kasper Sørensen <[email protected]>:
>>>>>>>>>> In the current API design of MetaModel, the
>>>>>>>>> DataContext.executeUpdate(...)
>>>>>>>>>> method is a void method. This was initially chosen because not
>> all
>>>>>>>>>> implementations have the capability to report anything about a
>>>> particular
>>>>>>>>>> update. But some do, for instance the no. of inserted, updated
>> or
>>>> deleted
>>>>>>>>>> records from a JDBC call. It would be nice to expose this
>>>> information
>>>>>>>>> when
>>>>>>>>>> available.
>>>>>>>>>> 
>>>>>>>>>> My suggestion for this would be to let the
>>>> DataContext.executeUpdate(...)
>>>>>>>>>> method return an object with this information. All methods on
>> the
>>>> new
>>>>>>>>> object
>>>>>>>>>> type would be optionally returning null, if no information is
>>>> available.
>>>>>>>>> But
>>>>>>>>>> when available, we can at least expose it this way.
>>>>>>>>>> 
>>>>>>>>>> The change wouldn't have a major impact, since any project using
>>>>>>>>> MetaModel
>>>>>>>>>> would already need to recompile because of the namespace change
>> to
>>>>>>>>>> org.apache.metamodel. And the change would be compile-time
>>>> compatible
>>>>>>>>> with
>>>>>>>>>> having a void return type.
>>>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>>

Re: [DISCUSS] get back update status after invoking executeUpdate(...)

Reply via email to