Re: The size of the tablestatus file is getting larger, does it impact the performance of reading this file?

2018-03-14 Thread manish gupta
I think maintaining a tablestatus backlog file is a good idea. It will also
help us quickly filter valid segments as the number of segments grows, since
query execution involves reading the table status file.

The SHOW SEGMENTS DDL can read both files to display its output.

Regards
Manish Gupta

On Thu, 15 Mar 2018 at 10:19 AM, xm_zzc <441586...@qq.com> wrote:

> Hi Jacky, Raghunandan S:
>   Thanks for your reply.
>   I am currently working on PR2045. This PR automatically deletes the
> segment lock files when the method
> 'SegmentStatusManager.deleteLoadsAndUpdateMetadata' is executed, and it
> scans the 'tablestatus' file to decide which segment lock files need to be
> deleted. Ravindra Pesala is concerned about the performance of reading the
> tablestatus file as it grows larger, so I want to know whether its size can
> be reduced.
>   Following Raghunandan S's suggestion, I think we can *append* the
> invisible segment list to a file called 'tablestatus.history' every time
> the 'CLEAN FILES FOR TABLE' command is executed, separating visible and
> invisible segments into two files. If we later need to list all segments
> (both visible and invisible) when executing 'SHOW SEGMENTS FOR TABLE', it
> just needs to read from the two files. Is it OK to do so?
>
>
>
> --
> Sent from:
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>
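
The split proposed above can be sketched as follows. This is only an
illustration under stated assumptions, not CarbonData's implementation: the
entry layout (a JSON list of objects with hypothetical "segment_id" and
"status" keys), the set of statuses treated as invisible, and the function
names clean_files and show_segments are all made up for the example. Only the
file names 'tablestatus' and 'tablestatus.history' come from the thread.

```python
import json
import os

# Assumption: these status values mark a segment as invisible to queries.
INVISIBLE = {"Marked for Delete", "Compacted"}


def clean_files(table_dir):
    """Move invisible entries from 'tablestatus' to 'tablestatus.history'."""
    status_path = os.path.join(table_dir, "tablestatus")
    history_path = os.path.join(table_dir, "tablestatus.history")

    with open(status_path) as f:
        entries = json.load(f)

    visible = [e for e in entries if e["status"] not in INVISIBLE]
    invisible = [e for e in entries if e["status"] in INVISIBLE]

    # Append one JSON object per line, so the history file only ever grows
    # and never has to be parsed or rewritten during CLEAN FILES.
    with open(history_path, "a") as f:
        for e in invisible:
            f.write(json.dumps(e) + "\n")

    # Rewrite the now smaller tablestatus through a temporary file and rename.
    tmp_path = status_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(visible, f)
    os.replace(tmp_path, status_path)
    return len(invisible)


def show_segments(table_dir, include_history=False):
    """List visible segments, plus archived ones when include_history is set."""
    with open(os.path.join(table_dir, "tablestatus")) as f:
        entries = json.load(f)
    history_path = os.path.join(table_dir, "tablestatus.history")
    if include_history and os.path.exists(history_path):
        with open(history_path) as f:
            entries += [json.loads(line) for line in f if line.strip()]
    return entries
```

Regular queries would call show_segments(table_dir) and touch only the small
file, while a full listing passes include_history=True. The history is written
before tablestatus is rewritten, so a failure in between could leave an entry
in both files but would not lose it; a real implementation would also need the
table status lock, which this sketch leaves out.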


Re: The size of the tablestatus file is getting larger, does it impact the performance of reading this file?

2018-03-14 Thread Raghunandan S
Dear Jacky,
It was purposefully done like that. The table status needs to give the
history of the transactions that happened on the system. This is like an
audit point.

Dear xm_zzc,
What is your use case?

In any case, we cannot permanently remove the entries from our system. Based
on the use case, we can consider moving them to a separate file. We can also
check what the size would be and optimise reading it from multiple places.

Regards
Raghu
On Wed, 14 Mar 2018 at 12:18 PM, Jacky Li wrote:

> Hi,
>
> Yes, I think you are right. Currently the CLEAN FILES command only deletes
> the segment data folder, but does not delete the metadata entries in the
> table_status file. I think this is the problem.
> Please feel free to open a JIRA ticket and improve it. Thanks.
>
> Regards,
> Jacky
>
> > On March 14, 2018, at 10:28 AM, xm_zzc <441586...@qq.com> wrote:
> >
> > Hi dev:
> >  The tablestatus file keeps getting larger. Does this impact the
> > performance of reading it, for example with 1 million segment entries in
> > the file? Many places scan this file.
> >  Why not delete the invisible segment info to reduce the size of the
> > tablestatus file? Will it be used later?
> >
> >


Re: [Discussion] About syntax of compaction on specified segments

2018-03-14 Thread Liang Chen
Hi

Thanks to jinzhou for starting this discussion.

I also propose using the solution suggested by manish, which does not impact
the current Major and Minor compaction behaviors.

Regards
Liang

manishgupta88 wrote
> Hi,
> 
> I agree with @gvramana (https://github.com/gvramana):
> 
> 1. We should *not use* the Major/Minor compaction types, as they have a
>    specific meaning and both are controlled by the system when deciding
>    whether a segment is valid to be compacted or not.
> 2. We should *not use* carbon.input.segments.default.seg_compact to set
>    the segments to be compacted.
> 3. We should introduce a new compaction type 'CUSTOM' in the DDL, as
>    suggested by @gvramana (https://github.com/gvramana), because it is
>    something like a forced compaction for the given segments: it will not
>    check the size and frequency of segments. We can work on using the
>    syntax below for custom compaction.
> 
> *ALTER TABLE [db_name.]table_name COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (0,5,8)*
> 
> Once a table is compacted using custom compaction, minor compaction no
> longer applies to the custom-compacted segment. A custom-compacted segment
> should participate only in major compaction, and only if it satisfies the
> major compaction size property.
> 
> Regards
> Manish Gupta
> 
> On Tue, Mar 13, 2018 at 2:55 PM, luffy <luffy.wang@> wrote:
> 
>> Having major and minor compaction is OK; there is no need for another type
>> like custom. I am more concerned about compaction performance.




