Re: [DISCUSSION] Display the segment ID when carbondata load is successful

2021-01-17 Thread Indhumathi M
Hi Nihal,

I also feel it is good to display only the segment ID at the end of the
command, similar to the Update command, which returns the number of updated
rows. There is no need to add other details (those can be added to the
Show Segments command if needed).

Regards,
Indhumathi M

On Mon, Jan 18, 2021 at 9:54 AM Ajantha Bhat wrote:

> Hi Nihal,
> In a concurrent scenario, we cannot map which load command was loaded as
> which segment ID.
> It is good to show the summary at the end of the command.
>
>
> I agree with David's suggestion.
> Along with load and insert, if possible we should give a summary for update,
> delete, and merge as well (for which we may start supporting concurrent
> operations in the near future).
>
>
> Thanks,
> Ajantha
>
> On Mon, 18 Jan, 2021, 9:49 am akashrn5 wrote:
>
> > Hi Nihal,
> >
> > The problem statement is not very clear: what is the use case, or in
> > which scenario is the problem faced? We need to get the result from
> > the successful segments themselves, so please elaborate a little on the
> > problem.
> >
> > Also, if you want to include more details, do not include them in the
> > default show segments output; maybe they can go in show segments with
> > query, which Likun implemented. But we can decide this once it is clear.
> >
> > Also, @vikram, showing cache here is not a good idea, as we already have a
> > command for that. If you are planning segment-wise cache details, we can
> > improve the existing cache-specific commands; let's not include them here.
> >
> > Thanks,
> >
> > Regards,
> > Akash
> >
> >
> >
> > --
> > Sent from:
> > http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
> >
>


Re: [DISCUSSION] Display the segment ID when carbondata load is successful

2021-01-17 Thread Ajantha Bhat
Hi Nihal,
In a concurrent scenario, we cannot map which load command was loaded as
which segment ID.
It is good to show the summary at the end of the command.


I agree with David's suggestion.
Along with load and insert, if possible we should give a summary for update,
delete, and merge as well (for which we may start supporting concurrent
operations in the near future).
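
To illustrate the concurrency point, here is a minimal sketch. It assumes a
toy `Table` class with a `load` method; both are hypothetical stand-ins, not
CarbonData code. Because each load returns the segment ID it was assigned,
every caller can identify its own segment regardless of the order in which
concurrent loads complete.

```python
import threading


class Table:
    """Toy stand-in for a table's segment list (hypothetical, not CarbonData code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.segments = []  # segments[i] holds the source loaded as segment id i

    def load(self, source):
        # Allocate the next segment id atomically and hand it back to the caller.
        with self._lock:
            segment_id = len(self.segments)
            self.segments.append(source)
        return segment_id


table = Table()
results = {}  # source -> segment id reported by that load


def run_load(source):
    results[source] = table.load(source)


threads = [
    threading.Thread(target=run_load, args=(f"file_{i}.csv",)) for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every caller can find its own segment, whatever order the loads ran in.
for source, segment_id in results.items():
    assert table.segments[segment_id] == source
```

Without the returned ID, a caller would have to guess from the segment list
afterwards, which is ambiguous once several loads run at the same time.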


Thanks,
Ajantha

On Mon, 18 Jan, 2021, 9:49 am akashrn5 wrote:

> Hi Nihal,
>
> The problem statement is not very clear: what is the use case, or in
> which scenario is the problem faced? We need to get the result from
> the successful segments themselves, so please elaborate a little on the
> problem.
>
> Also, if you want to include more details, do not include them in the
> default show segments output; maybe they can go in show segments with
> query, which Likun implemented. But we can decide this once it is clear.
>
> Also, @vikram, showing cache here is not a good idea, as we already have a
> command for that. If you are planning segment-wise cache details, we can
> improve the existing cache-specific commands; let's not include them here.
>
> Thanks,
>
> Regards,
> Akash
>
>
>
>


Re: [DISCUSSION] Display the segment ID when carbondata load is successful

2021-01-17 Thread akashrn5
Hi Nihal,

The problem statement is not very clear: what is the use case, or in
which scenario is the problem faced? We need to get the result from
the successful segments themselves, so please elaborate a little on the
problem.

Also, if you want to include more details, do not include them in the
default show segments output; maybe they can go in show segments with
query, which Likun implemented. But we can decide this once it is clear.

Also, @vikram, showing cache here is not a good idea, as we already have a
command for that. If you are planning segment-wise cache details, we can
improve the existing cache-specific commands; let's not include them here.

Thanks,

Regards,
Akash





Re: [Discussion]Presto Queries leveraging Secondary Index

2021-01-17 Thread akashrn5
Hi Venu,

Thanks for the suggestion.

1. Option 1 is not a good idea; I think performance will be bad.
2. For option 2, we have other indexes, such as Lucene and Bloom, where
distributed pruning happens. Lucene is also an index stored along with the
table, but not as another table like SI, so we scan Lucene in a distributed
job and then return the index for the filter expression. Similarly, we can
call SI to scan and prune, but since we need a Spark job to do it, the
index server is the only option.
We can use that for scanning, but I'm afraid it may impact other
concurrent queries, so I would suggest doing a POC with the index
server first. That will reveal any other bottlenecks with this approach,
and then we can decide and start the design.

If you have already done a POC and have some results and a design ready, we
can review them.

Thanks

Regards
Akash





Re: [Discussion]Presto Queries leveraging Secondary Index

2021-01-17 Thread David CaiQiang
Hi Venu and Ajantha,

For the new SI solution, I also have some suggestions.
1. Agree to avoid the query plan rewrite.
2. Push down the SI filter to the pruning step of the main table directly on
the driver side, but we need a distributed job to improve performance.
3. Segment-level usability:
   for example, when only one segment doesn't have indexes but the other 99
segments do, SI should still be used to improve the filter query on the
index column.
4. Consider the filter column's selectivity; it should affect the priority
of the indexes (including the main index).
   Phase 1: based on rules (filter order or hint)
   Phase 2: based on cost (statistics)
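
As a rough illustration of points 3 and 4, the sketch below picks an index
per segment. The `Index` class, the `choose_index` function, and the
selectivity map are hypothetical names introduced only for this example; they
are not CarbonData APIs, and the selectivity numbers are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    name: str
    column: str
    segments: frozenset  # segment ids for which this index has been built


def choose_index(indexes, filter_columns, segment_id, selectivity=None):
    """Pick an index for one segment, or None if no index covers it.

    Phase 1 (rule based): follow the order of filter_columns.
    Phase 2 (cost based): if selectivity estimates are given, prefer the
    column expected to match the smallest fraction of rows.
    """
    usable = [
        ix
        for ix in indexes
        if ix.column in filter_columns and segment_id in ix.segments
    ]
    if not usable:
        return None  # this segment falls back to main-table pruning only
    if selectivity:
        return min(usable, key=lambda ix: selectivity.get(ix.column, 1.0))
    return min(usable, key=lambda ix: filter_columns.index(ix.column))


# Segment 99 has no index on "city"; segments 0..98 do.
si_city = Index("si_city", "city", frozenset(range(99)))
si_name = Index("si_name", "name", frozenset(range(100)))

# Rule based: the first filter column with a usable index wins.
assert choose_index([si_city, si_name], ["city", "name"], 5) is si_city
# The one unindexed segment does not stop the other 99 from using the index.
assert choose_index([si_city], ["city"], 99) is None
assert choose_index([si_city], ["city"], 0) is si_city
```

The decision is made per segment, so a single segment without an index only
affects pruning for that segment.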





-
Best Regards
David Cai


Re: [DISCUSSION] Display the segment ID when carbondata load is successful

2021-01-17 Thread David CaiQiang
Hi Nihal, my suggestions are as follows:
1. Include the normal output of the show segment command.
2. Add more information for loading, like numFiles, numRows, rawDataSize
(maybe show segment needs it as well; take care of CDC, which needs to update
this information).
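
For illustration only, a load summary with these fields might look like the
sketch below. `LoadSummary` and `format_summary` are hypothetical names, the
byte unit for the raw data size is an assumption, and the values are made up;
the actual output format is still open in this discussion.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LoadSummary:
    segment_id: str
    num_files: int
    num_rows: int
    raw_data_size: int  # assumed to be in bytes


def format_summary(summary):
    """Render the summary as a header line followed by one row of values."""
    row = asdict(summary)  # keeps the field declaration order
    header = " | ".join(row)
    values = " | ".join(str(v) for v in row.values())
    return header + "\n" + values


print(format_summary(LoadSummary("3", 2, 1000, 4096)))
```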



-
Best Regards
David Cai