Re: [DISCUSSION] Display the segment ID when carbondata load is successful
Hi Nihal,

I also feel it is good to display only the segment ID at the end of the command, similar to the Update command, which returns the number of updated rows. There is no need to add other details (they can be added to the Show Segments command if needed).

Regards,
Indhumathi M

On Mon, Jan 18, 2021 at 9:54 AM Ajantha Bhat wrote:

> Hi Nihal,
>
> In a concurrent scenario we cannot map which load command has been loaded
> as which segment ID.
> It is good to show the summary at the end of the command.
>
> I agree with David's suggestion.
> Along with load and insert, if possible we should give a summary for
> update, delete and merge also (for which we may start supporting
> concurrent operations in the near future).
>
> Thanks,
> Ajantha
>
> On Mon, 18 Jan, 2021, 9:49 am akashrn5, wrote:
>
> > Hi Nihal,
> >
> > The problem statement is not very clear. Basically, what is the use
> > case, or in which scenario is the problem faced? Since we need to get
> > the result from the successful segments themselves, please elaborate a
> > little on the problem.
> >
> > Also, if you want to include more details, do not include them in the
> > default show segments output; they could perhaps go into show segments
> > with query, which likun implemented. But we can decide this once the
> > problem is clear.
> >
> > Also, @vikram, showing cache here is not a good idea, as we already
> > have a command for that. If you are planning segment-wise cache details,
> > we can improve the existing cache-specific commands; let's not include
> > them here.
> >
> > Thanks,
> >
> > Regards,
> > Akash
Re: [DISCUSSION] Display the segment ID when carbondata load is successful
Hi Nihal,

In a concurrent scenario we cannot map which load command has been loaded as which segment ID. It is good to show the summary at the end of the command.

I agree with David's suggestion. Along with load and insert, if possible we should give a summary for update, delete and merge also (for which we may start supporting concurrent operations in the near future).

Thanks,
Ajantha

On Mon, 18 Jan, 2021, 9:49 am akashrn5, wrote:

> Hi Nihal,
>
> The problem statement is not very clear. Basically, what is the use case,
> or in which scenario is the problem faced? Since we need to get the
> result from the successful segments themselves, please elaborate a little
> on the problem.
>
> Also, if you want to include more details, do not include them in the
> default show segments output; they could perhaps go into show segments
> with query, which likun implemented. But we can decide this once the
> problem is clear.
>
> Also, @vikram, showing cache here is not a good idea, as we already have
> a command for that. If you are planning segment-wise cache details, we
> can improve the existing cache-specific commands; let's not include them
> here.
>
> Thanks,
>
> Regards,
> Akash
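A minimal sketch of the concurrency point above, assuming a hypothetical `SegmentAllocator` that stands in for whatever hands out segment IDs, and a hypothetical `run_load` function. None of these names are CarbonData APIs. It only illustrates that when each load returns its own segment ID, the caller can map commands to segments even when several loads run at once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SegmentAllocator:
    """Hypothetical stand-in for the component that assigns segment IDs."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0

    def allocate(self):
        # The order in which concurrent loads reach this point is not
        # predictable, so a caller cannot guess its ID from outside.
        with self._lock:
            segment_id = self._next_id
            self._next_id += 1
            return str(segment_id)


def run_load(allocator, source_path):
    """Hypothetical load: returns the segment ID it was assigned."""
    segment_id = allocator.allocate()
    # ... the actual data write would happen here ...
    return {"source": source_path, "segment_id": segment_id}


allocator = SegmentAllocator()
sources = [f"/data/batch_{i}.csv" for i in range(4)]  # illustrative paths

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda s: run_load(allocator, s), sources))

for r in results:
    print(f"{r['source']} -> segment {r['segment_id']}")
```

Which source ends up in which segment can differ from run to run, which is why the mapping has to come back from the command itself.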
Re: [DISCUSSION] Display the segment ID when carbondata load is successful
Hi Nihal,

The problem statement is not very clear. Basically, what is the use case, or in which scenario is the problem faced? Since we need to get the result from the successful segments themselves, please elaborate a little on the problem.

Also, if you want to include more details, do not include them in the default show segments output; they could perhaps go into show segments with query, which likun implemented. But we can decide this once the problem is clear.

Also, @vikram, showing cache here is not a good idea, as we already have a command for that. If you are planning segment-wise cache details, we can improve the existing cache-specific commands; let's not include them here.

Thanks,

Regards,
Akash

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Re: [Discussion]Presto Queries leveraging Secondary Index
Hi Venu,

Thanks for the suggestions.

1. Option 1 is not a good idea. I think performance will be bad.
2. For option 2, we have other indexes, such as Lucene and Bloom, where distributed pruning happens. Lucene is also an index stored along with the table, but not as another table like SI, so we scan Lucene in a distributed job and then return the index for the filter expression. Similarly, we can call SI to scan and prune, but since we need a Spark job to do it, the index server is the only option. We can use that for scanning, but I am afraid it may impact other concurrent queries. So I would suggest doing a POC with the index server first, where we will get to know other bottlenecks of this approach, and then we can decide and start the design.

If you have already done a POC and have some results, and the design is ready, we can review that.

Thanks

Regards
Akash

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Re: [Discussion]Presto Queries leveraging Secondary Index
Hi Venu and Ajantha,

I also have some suggestions for the new SI solution.

1. Agree to avoid query plan rewrite.
2. Push down the SI filter to the pruning step of the main table directly on the driver side, but we need a distributed job to improve performance.
3. Segment-level usability: for example, when only one segment doesn't have indexes but the other 99 segments do, SI should still be used to improve filter queries on the index column.
4. Consider the filter column's selectivity; it should impact the priority of the indexes (including the main index).
   Phase 1: based on rules (filter order or hint)
   Phase 2: based on cost (statistics)

-
Best Regards
David Cai

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
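Points 3 and 4 (phase 1) can be sketched roughly as follows. This is only an illustration under stated assumptions: `Segment`, `plan_pruning` and `order_indexes` are hypothetical names, not CarbonData classes, and the real pruning flow may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    segment_id: str
    has_si: bool  # assumption: we can tell per segment whether SI data exists


def plan_pruning(segments):
    """Segment-level usability: use SI wherever it exists and fall back to
    main-table pruning only for the segments that lack it."""
    via_si = [s.segment_id for s in segments if s.has_si]
    fallback = [s.segment_id for s in segments if not s.has_si]
    return via_si, fallback


def order_indexes(filter_columns, indexed_columns, hint=None):
    """Phase-1 rule-based priority: a hinted index goes first, the rest
    follow the order of the filter columns in the query."""
    candidates = [c for c in filter_columns if c in indexed_columns]
    if hint in candidates:
        candidates.remove(hint)
        candidates.insert(0, hint)
    return candidates


# 100 segments, only the last one has no SI data.
segments = [Segment(str(i), has_si=(i != 99)) for i in range(100)]
via_si, fallback = plan_pruning(segments)
print(len(via_si), fallback)

print(order_indexes(["city", "name", "age"], {"name", "age"}, hint="age"))
```

A cost-based phase 2 would replace the ordering rule in `order_indexes` with a ranking driven by column statistics; that part is not sketched here.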
Re: [DISCUSSION] Display the segment ID when carbondata load is successful
Hi Nihal,

My suggestions are as follows:

1. Include the normal output of the show segment command.
2. Add more information about the load, such as numFiles, numRows and rawDataSize (show segment may need this too; take care of CDC, which needs to update this information).

-
Best Regards
David Cai

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
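As a rough illustration of the summary row suggested above, here is a minimal sketch. `LoadSummary` and `format_summary` are hypothetical names, and the field set is just the three items named in the mail (numFiles, numRows, rawDataSize) plus a segment ID and status; the actual command output format is still to be decided.

```python
from dataclasses import dataclass, asdict


@dataclass
class LoadSummary:
    segment_id: str
    status: str
    num_files: int
    num_rows: int
    raw_data_size: int  # assumed to be in bytes


def format_summary(summary):
    """Render the summary as one header line and one value line."""
    row = asdict(summary)
    header = " | ".join(row.keys())
    values = " | ".join(str(v) for v in row.values())
    return header + "\n" + values


# Illustrative values only, not taken from a real load.
print(format_summary(LoadSummary("5", "Success", 3, 1000, 2048)))
```

Keeping the summary as a small structured record would let the same fields be reused by show segment, and updated by CDC, without changing the display code.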