+1. One suggestion I would like to make: if the user specifies that the query results should go to a file, we should not apply the limit clause on the server.
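To make the suggestion concrete, the interplay between an explicit LIMIT, file output, and the server-side default cap could look something like the sketch below. All names here (`QueryLimitPolicy`, `DEFAULT_LIMIT`, the `-1` "no limit" convention) are illustrative, not Geode's actual code:

```java
// Sketch of the suggested policy: an explicit LIMIT in the query always
// wins; with no explicit LIMIT, file output gets the full result set,
// while console output gets the server-side default cap.
// All names are illustrative, not Geode's actual API.
public class QueryLimitPolicy {
    static final int DEFAULT_LIMIT = 100;
    static final int NO_LIMIT = -1;

    static int effectiveLimit(boolean outputToFile, Integer explicitLimit) {
        if (explicitLimit != null) {
            return explicitLimit;   // a LIMIT in the query is always honored
        }
        if (outputToFile) {
            return NO_LIMIT;        // dumping to a file: skip the implicit cap
        }
        return DEFAULT_LIMIT;       // console output: cap the result set
    }
}
```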
On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <[email protected]> wrote:

> Basically, our reasoning is that client-side pagination is not as useful
> as people might think: you can either get all the results dumped to the
> console and use the scroll bar to move back and forth, or dump them into
> a file and use whatever piping mechanism is supported by your
> environment. The server side retrieves everything at once anyway and
> saves the entire result set in the backend, so it's not like we are
> saving any server-side work here.
>
> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <[email protected]> wrote:
>
>> Currently, the way client-side pagination is implemented is convoluted
>> and of doubtful usefulness. We are proposing to get rid of the
>> client-side pagination and only have the server side impose a limit
>> (and maybe implement pagination on the server side later on).
>>
>> The new behavior should look like this:
>>
>> gfsh> set APP_FETCH_SIZE 50;
>> gfsh> query --query="select * from /A" // suppose entry size is 3
>>
>> Result : true
>> Limit : 50
>> Rows : 3
>>
>> Result
>> --------
>> value1
>> value2
>> value3
>>
>>
>> gfsh> query --query="select * from /A" // suppose entry size is 1000
>>
>> Result : true
>> Limit : 50
>> Rows : 50
>>
>> Result
>> --------
>> value1
>> ...
>> value50
>>
>> gfsh> query --query="select * from /A limit 100" // suppose entry size is 1000
>>
>> Result : true
>> Rows : 100
>>
>> Result
>> --------
>> value1
>> ...
>> value100
>>
>>
>> gfsh> query --query="select * from /A limit 500" --file="output.txt" // suppose entry size is 1000
>>
>> Result : true
>> Rows : 500
>>
>> Query results output to /var/tempFolder/output.txt
>>
>> (And the output.txt content would be:
>> Result
>> --------
>> value1
>> ...
>> value500)
>>
>>
>> Bear in mind that we are trying to get rid of client-side pagination,
>> so the --page-size or --limit option would not apply anymore. Only the
>> limit inside the query will be honored by the server side.
>> If the query does not have a limit clause, the server side will impose
>> a limit (default 100). The limit can only be overridden if the user
>> explicitly chooses to do so, so that users do not accidentally execute
>> a query that would return a large result set.
>>
>> Would this be sufficient to replace the client-side pagination?
>>
>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <[email protected]> wrote:
>>
>>> To make it clear, gfsh could print the query it sent to the server in
>>> the result summary (it shows whether it got executed with the limit):
>>>
>>> Query :
>>> Result : true
>>> startCount : 0
>>> endCount : 20
>>> Rows : 1
>>>
>>> -Anil.
>>>
>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <[email protected]> wrote:
>>>
>>>> I think it might be worth differentiating the result "LIMIT" (as used
>>>> in an OQL query statement, e.g. "SELECT * FROM /Region WHERE ...
>>>> LIMIT 1000") from what is actually "streamed" back to *Gfsh* by
>>>> default (e.g. 100).
>>>>
>>>> Clearly, sending all the results back is quite expensive, depending
>>>> on the number of results/LIMIT specified. Therefore, whatever
>>>> "--option" is provided to the `query` command is a further reduction
>>>> in what is actually streamed back to the client (e.g. *Gfsh*)
>>>> initially, sort of like paging ... `gfsh> query --query="SELECT *
>>>> FROM /Region WHERE ... LIMIT 1000" --page-size=25` ... perhaps?
>>>>
>>>> That said, I think having two limits, as in an OQL LIMIT and a
>>>> --limit option, would just be confusing to users. LIMIT, like sort
>>>> (ORDER BY), can only be effectively applied in the OQL, as it
>>>> determines what results the query actually returns.
>>>>
>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <[email protected]> wrote:
>>>>
>>>>> >> Actually a really nice thing would be to put the pagination
>>>>> feature into the OQL engine where it belongs.
>>>>> +1 on this.
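The "impose a default limit unless the query already has one" rule could be sketched roughly as follows. The class name, the regex, and the assumption that an OQL limit clause appears at the end of the query are all illustrative; this is not Geode's actual implementation:

```java
import java.util.regex.Pattern;

// Sketch of imposing a server-side default limit when the query has no
// explicit LIMIT clause. The regex assumes the limit clause, if present,
// ends the query; names and parsing are assumptions, not Geode's code.
public class DefaultLimit {
    private static final Pattern LIMIT_CLAUSE =
        Pattern.compile("\\blimit\\s+\\d+\\s*$", Pattern.CASE_INSENSITIVE);

    /** Appends "limit <defaultLimit>" unless the query already ends with one. */
    static String ensureLimit(String query, int defaultLimit) {
        String trimmed = query.trim();
        if (LIMIT_CLAUSE.matcher(trimmed).find()) {
            return trimmed;                      // user's explicit limit wins
        }
        return trimmed + " limit " + defaultLimit;
    }
}
```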
>>>>> >> if the query mode is interactive, it sends the first 20
>>>>> (page-size, not configurable) records, and the user uses "n" to go
>>>>> to the next page;
>>>>> >> once it hits the last page (showing all 1000 records or reaching
>>>>> the end of the result set), the command finishes.
>>>>>
>>>>> We could provide one more option for the end user to quit paging and
>>>>> go back to the gfsh prompt for new commands (if it's not there
>>>>> already).
>>>>>
>>>>> I think providing multiple options to view a large result set is a
>>>>> nice feature from a tooling perspective (interactive result
>>>>> batching, dumping into an external file, etc.).
>>>>>
>>>>> >> It’s fairly common in query tooling to be able to set a result
>>>>> set limit.
>>>>> Yes... many interactive query tools allow pagination/batching as
>>>>> part of the result display.
>>>>>
>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>> We need to make sure that the user can differentiate query commands
>>>>> from options provided by the tool.
>>>>>
>>>>> -Anil.
>>>>>
>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <[email protected]> wrote:
>>>>>
>>>>>> The way I read this is: one is limiting on the server side, the
>>>>>> other is limiting on the client side. IOW, a limit within the query
>>>>>> string acts on the server side.
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <[email protected]> wrote:
>>>>>>
>>>>>>> What if the user wants to do:
>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>
>>>>>>> What's the difference between putting it inside the query string
>>>>>>> or outside? I think eventually it's adding the limit clause to the
>>>>>>> query.
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <[email protected]> wrote:
>>>>>>>
>>>>>>>> It’s fairly common in query tooling to be able to set a result
>>>>>>>> set limit.
>>>>>>>> I would make this a first-class option within gfsh instead of an
>>>>>>>> environment variable:
>>>>>>>>
>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>
>>>>>>>> The result set limit is semantically different from specifying a
>>>>>>>> LIMIT in the OQL query itself.
>>>>>>>>
>>>>>>>> Anthony
>>>>>>>>
>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <[email protected]> wrote:
>>>>>>>>
>>>>>>>> +1 for the combination of 1 and 2 as well. It would be
>>>>>>>> interesting to explore at least a couple of output formats, CSV
>>>>>>>> being one of the most common for people who want to import or
>>>>>>>> analyze the data using other tools.
>>>>>>>>
>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Actually, a really nice thing would be to put the pagination
>>>>>>>>> feature into the OQL engine where it belongs. Clients shouldn't
>>>>>>>>> have to implement pagination.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Mike Stolz
>>>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>>>> Mobile: +1-631-835-4771
>>>>>>>>>
>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I prefer to redirect output to a file when there is any chance
>>>>>>>>>> that the results might be huge. Thus I find the combination of
>>>>>>>>>> #1 and #2 to be sufficient for me.
>>>>>>>>>>
>>>>>>>>>> Sarge
>>>>>>>>>>
>>>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <[email protected]> wrote:
>>>>>>>>>> >
>>>>>>>>>> > Hi, all gfsh users,
>>>>>>>>>> >
>>>>>>>>>> > In our refactor week, we are trying to refactor how the
>>>>>>>>>> multi-step command is implemented. The current implementation
>>>>>>>>>> is hard to understand to begin with.
>>>>>>>>>> The implementation breaks OO design principles in multiple
>>>>>>>>>> ways. It's not thread-safe either. This is an internal command
>>>>>>>>>> type, and only our "query" command uses it.
>>>>>>>>>> >
>>>>>>>>>> > This is how our current "query" command works:
>>>>>>>>>> > 1) The user issues a "query --query='select * from /A'" command.
>>>>>>>>>> > 2) The server retrieves the first 1000 (fetch-size, not
>>>>>>>>>> configurable) rows.
>>>>>>>>>> > 3) If the query mode is NOT interactive, it sends back all
>>>>>>>>>> the results at once.
>>>>>>>>>> > 4) If the query mode is interactive, it sends the first 20
>>>>>>>>>> (page-size, not configurable) records, and the user uses "n" to
>>>>>>>>>> go to the next page; once it hits the last page (showing all
>>>>>>>>>> 1000 records or reaching the end of the result set), the
>>>>>>>>>> command finishes.
>>>>>>>>>> >
>>>>>>>>>> > We would like to ask how useful this interactive feature is.
>>>>>>>>>> Is it critical for you? Would the following simplification be
>>>>>>>>>> sufficient?
>>>>>>>>>> >
>>>>>>>>>> > 1) The query command always returns the entire fetch size. We
>>>>>>>>>> can make it configurable through environment variables,
>>>>>>>>>> defaulting to 100, and you can also override it in each
>>>>>>>>>> individual query command using "query --query='select * from /A
>>>>>>>>>> limit 10'".
>>>>>>>>>> >
>>>>>>>>>> > 2) Provide an option for you to specify a file where we can
>>>>>>>>>> dump all the query results, and you can use shell pagination to
>>>>>>>>>> list the content of the file.
>>>>>>>>>> >
>>>>>>>>>> > Please let us know your thoughts/comments. Thanks!
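The fetch-size part of simplification (1) could be resolved along these lines. The variable name APP_FETCH_SIZE comes from the examples earlier in the thread; the parsing and fallback behavior here are assumptions, not Geode's actual logic:

```java
// Sketch: resolve the fetch size from a gfsh environment variable value,
// falling back to a default of 100 when it is unset, malformed, or
// non-positive. This is an illustrative sketch, not Geode's actual code.
public class FetchSize {
    static final int DEFAULT_FETCH_SIZE = 100;

    static int resolve(String envValue) {
        if (envValue == null || envValue.trim().isEmpty()) {
            return DEFAULT_FETCH_SIZE;
        }
        try {
            int parsed = Integer.parseInt(envValue.trim());
            return parsed > 0 ? parsed : DEFAULT_FETCH_SIZE;  // ignore non-positive values
        } catch (NumberFormatException e) {
            return DEFAULT_FETCH_SIZE;                        // ignore malformed values
        }
    }
}
```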
>
> --
> Cheers
>
> Jinmei
