Hi Jacky,

Thanks for your comments. I guess I should have uploaded the document in
Google Doc format instead of PDF; somehow Google Docs messes up all the
diagrams when I copy-paste, and I have not figured out a way to fix it.
Anyway, I apologize for the inconvenience to those who wanted to add in-line
comments in the document. For now, I will try to address your questions
through email below and see if I can upload a commentable version to Google
Docs.

Context: The table status file in the metadata folder will be used to
indicate the status of streaming ingestion, such as in-progress or
successful.
Jacky>> Does this mean the status file needs to be rewritten for every mini
batch?
AA>> Currently, yes. This may not be very efficient; we could consider
keeping this info in the metastore instead. Let me know if you have any
ideas.
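
To make this concrete, here is a minimal sketch (in Scala, using the Hadoop
FileSystem API) of how the status file could be rewritten per mini batch,
assuming a simple one-record JSON layout published via a temp file plus
rename; the field names (batchId, status, validOffset) are placeholders, not
the actual CarbonData schema.

import org.apache.hadoop.fs.{FileSystem, Path}
import java.nio.charset.StandardCharsets

// Hypothetical status record written once per mini batch.
// Field names are placeholders, not the real schema.
case class StreamingStatus(batchId: Long, status: String, validOffset: Long)

object StreamingStatusFile {
  // Rewrite the status file: write to a temp file, then rename over the old one.
  def update(fs: FileSystem, statusPath: Path, s: StreamingStatus): Unit = {
    val tmp = new Path(statusPath.getParent, statusPath.getName + ".tmp")
    val out = fs.create(tmp, true) // overwrite the temp file if it exists
    try {
      val json =
        s"""{"batchId":${s.batchId},"status":"${s.status}","validOffset":${s.validOffset}}"""
      out.write(json.getBytes(StandardCharsets.UTF_8))
      out.hsync() // make it durable before publishing
    } finally {
      out.close()
    }
    fs.delete(statusPath, false) // old status file may not exist yet
    fs.rename(tmp, statusPath)   // publish the new status
  }
}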

Context: Only one writer (executor) is allowed to write into a streaming
data file.
Jacky>> Every writer writes to one file, but there could be parallel writers
writing to different files, right?
AA>> Yes, there can be parallel executors, each writing to a different file;
see the sketch below.
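
As an illustration of "one writer per file", each task could derive its own
file name from its partition id so that parallel executors never append to
the same file; the naming pattern here is hypothetical.

import org.apache.spark.TaskContext

// Hypothetical helper: one streaming file per task, named by partition id,
// so parallel writers never share a file.
def streamingFileFor(segmentPath: String): String = {
  val partitionId = TaskContext.getPartitionId()
  s"$segmentPath/part-$partitionId.carbonstream"
}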

Context: BlockHeader
Jacky>> In BlockletHeader, I think two more fields need to be added:
i64 blocklet_offset and list<DataChunk3> datachunks; DataChunk3 contains
the min/max of each column in each page. Mutation, blockletindex,
blockletinfo, and dictionary are not required.
AA>> Yes, this probably needs more refinement and discussion. I was thinking
more along the lines of using the existing V3 format rather than adding a
new one.
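
Just to make the proposed fields concrete, here is a rough Scala mirror of
the Thrift additions you suggest (i64 blocklet_offset, list<DataChunk3>
datachunks); the exact shape of DataChunk3 shown here is an assumption for
illustration only, not the real Thrift definition.

// Rough Scala mirror of the suggested BlockletHeader additions.
// DataChunk3Sketch is an assumed shape: per-page min/max for one column.
case class DataChunk3Sketch(
    pageMinValues: Seq[Array[Byte]],
    pageMaxValues: Seq[Array[Byte]])

case class BlockletHeaderSketch(
    blockletOffset: Long,                 // i64 blocklet_offset
    dataChunks: Seq[DataChunk3Sketch])    // list<DataChunk3> datachunks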

Context: Approach-2 for file format
Jacky>> Is this metadata file appendable? It should not have a footer then.
And how do we maintain the locality of this file and the stream file together?
AA>> Yes, the metadata file will be appendable; the footer will be added
only when the file is complete. Ideally, co-location with the base streaming
file would be the best case; I am not sure whether the HDFS data placement
policy provides any configuration for that.
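
A minimal sketch of the appendable metadata file, assuming HDFS append is
available: entries are appended and hflushed as mini batches complete, and
the footer is written only when the file is finalized. The entry/footer byte
layout is hypothetical.

import org.apache.hadoop.fs.{FileSystem, Path}

// Entries are appended while the stream is live; the footer is written once
// at the end, when the file is complete.
class AppendableMetaFile(fs: FileSystem, path: Path) {
  private val out =
    if (fs.exists(path)) fs.append(path) else fs.create(path, false)

  def appendEntry(entry: Array[Byte]): Unit = {
    out.write(entry)
    out.hflush() // make the entry visible to readers without closing the file
  }

  def finish(footer: Array[Byte]): Unit = {
    out.write(footer) // footer added only when the file is complete
    out.close()
  }
}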

Context: Write Flow diagram
Jacky>> 1. In structured streaming, doesn't the executor receive events
directly from the streaming source?
AA>> Yes. After the receiver is set up, the driver will have a
StreamingQueryListener to communicate with executors. I will add arrows from
the source to the executors to make this clearer.
2. Is the metadata protected by some lock? How do two executors write to it
simultaneously?
AA>> Yes, the metadata will be protected by a lock. Again, we need to
explore a more efficient way if there is one.
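
One simple (not necessarily final) way to protect the metadata is a lock
file whose atomic creation acts as the mutex: FileSystem.create with
overwrite = false fails if the file already exists, so only one executor
wins. Retry/backoff and stale-lock handling are omitted in this sketch.

import java.io.IOException
import org.apache.hadoop.fs.{FileSystem, Path}

// Minimal lock-file sketch: whichever executor creates the lock file first
// holds the lock on the metadata.
class MetadataLock(fs: FileSystem, lockPath: Path) {
  def tryLock(): Boolean =
    try {
      fs.create(lockPath, false).close() // fails if the lock file already exists
      true
    } catch {
      case _: IOException => false
    }

  def unlock(): Unit = fs.delete(lockPath, false)
}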

Context: Read Consistency
Jacky>> I think more analysis is needed here; what about a query consisting
of two scan operations in different stages?
AA>> I need to check on that. My assumption is that we have only one query
start timestamp, which all scans can utilize.
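
The idea would be something like the sketch below: the driver captures a
single query-start timestamp, and every scan, in any stage, reads only data
committed at or before it. FileCommit and its fields are hypothetical names
used for illustration.

// Timestamp-based read consistency: all scan stages of one query see the
// same snapshot, defined by the query-start timestamp.
case class FileCommit(path: String, commitTimestamp: Long, validLength: Long)

def snapshotFor(queryStartTs: Long, commits: Seq[FileCommit]): Seq[FileCommit] =
  commits.filter(_.commitTimestamp <= queryStartTs)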

Context: Compaction
Jacky>> Can we have some policy so that the user does not need to manually
trigger it?
AA>> Yes, this needs to be configurable, for example based on the number of
streaming files; see the sketch below.
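
For example, a hypothetical threshold-based trigger could look like this
(the property name below is made up for illustration, not an existing carbon
property):

// Auto-compaction trigger: compact the streaming segment once the number of
// streaming files crosses a configurable threshold.
def shouldTriggerCompaction(streamingFileCount: Int, conf: Map[String, String]): Boolean = {
  val threshold = conf.getOrElse("carbon.streaming.compaction.file.threshold", "10").toInt
  streamingFileCount >= threshold
}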

Context: Spark Structured Streaming info/background, "No aggregation
supported"
Jacky>> You mean no aggregate query is allowed?
AA>> This limit is on the writer side: Spark writeStream with a file sink
(e.g., for Parquet) does not support performing aggregation before writing
to the file sink. Once the data is written, it can be queried with
aggregations.
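
For illustration: the streaming write to a file sink must be a plain append
(no aggregation before the sink), but the written data can then be
aggregated with a normal batch read. The source, paths, and column name
below are placeholders.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("stream-to-file-sink").getOrCreate()

// Streaming write: a plain projection, no aggregation. File sinks only
// support append output mode, which is why aggregation before the sink is
// not allowed.
val input = spark.readStream
  .format("socket")              // placeholder source for illustration
  .option("host", "localhost")
  .option("port", "9999")
  .load()

val query = input.writeStream
  .format("parquet")
  .option("path", "/tmp/stream_out")              // placeholder path
  .option("checkpointLocation", "/tmp/stream_ckpt")
  .outputMode("append")
  .start()

// Later, the written data can be aggregated with a normal batch query.
val counts = spark.read.parquet("/tmp/stream_out").groupBy("value").count()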

Best Regards,
Aniket

On Wed, Mar 29, 2017 at 8:46 AM, Jacky Li <jacky.li...@qq.com> wrote:

> Hi Aniket,
>
> Comment inline
> And I have put some review comment in the PDF here:
> https://drive.google.com/file/d/0B5vjWGChUwXdSUV0OTFkTGE4am8/
> view?usp=sharing <https://drive.google.com/file/d/
> 0B5vjWGChUwXdSUV0OTFkTGE4am8/view?usp=sharing>
>
> > 在 2017年3月29日,上午7:10,Aniket Adnaik <aniket.adn...@gmail.com> 写道:
> >
> > Hi Jacky,
> >
> > Please see my comments below;
> > 1. In this phase, is it still using columnar format? Save to a file for
> > every mini batch? If so, it is only readable after the file has been
> closed
> > and some metadata need to be kept to indicate the availability of the new
> > file.
> >
> > AA >> yes, for initial phase it will use default columnar format and save
> > to file every mini batch. Closing of file may not be needed, as HDFS
> allows
> > single writer-multiple readers. But yes it will require us to maintain a
> > streaming_status file to let readers know about valid timestamp and
> offsets
> > during getsplits.
> >
> > 2. How to map the partition concept in spark to files in streaming
> segment?
> > I guess some small file will be created, right?
> >
> > AA>> In streaming context, writeStream.partitionBy() may require
> CarbonData
> > to create separate folder for each partition.
> > Folder may look like  \TableName\_Fact\part0\StreamingSegment\
> > *partition_0\streamingfile.001*
> > However, I am not sure how carbondata will utilize this partition info as
> > my assumption is currently CarbonData does not support
> partitioning.Also, I
> > am not sure if existing table with no partitioning schema can work well.
> > This needs further analysis.
> >
>
> Currently carbon does not support partition yet, but we do have future
> plan for partitioning, for the bulkload scenario. The draft idea is to add
> a partition step before the input step in current loading pipeline
> framework. And the folder structure may look like: 
> \TableName\Fact\part0\segment0.
> I will describe it in another thread. It think user can use the same
> partition key for both bulkload and streaming ingest.
>
>
> > 3. Phase-2 : Add append support if not done in phase 1. Maintain append
> offsets
> > and metadata information.
> > Is the streaming data file format implemented in this phase?
> > AA>>  I think we can directly leverage from existing V3 format without
> much
> > changes in basic writer/reader framework, in that case implementing
> > streaming file format is a possibility.
> >
> > Best Regards,
> > Aniket
> >
> > On Tue, Mar 28, 2017 at 8:22 AM, Jacky Li <jacky.li...@qq.com> wrote:
> >
> >> Hi Aniket,
> >>
> >> This feature looks great, the overall plan also seems fine to me. Thanks
> >> for proposing it.
> >> And I have some doubts inline.
> >>
> >>> 在 2017年3月27日,下午6:34,Aniket Adnaik <aniket.adn...@gmail.com> 写道:
> >>>
> >>> Hi All,
> >>>
> >>> I would like to open up a discussion for new feature to support
> streaming
> >>> ingestion in CarbonData.
> >>>
> >>> Please refer to design document(draft) in the link below.
> >>>     https://drive.google.com/file/d/0B71_EuXTdDi8MlFDU2tqZU9BZ3M
> >>> /view?usp=sharing
> >>>
> >>> Your comments/suggestions are welcome.
> >>> Here are some high level points.
> >>>
> >>> Rationale:
> >>> The current ways of adding user data to CarbonData table is via LOAD
> >>> statement or using SELECT query with INSERT INTO statement. These
> methods
> >>> add bulk of data into CarbonData table into a new segment. Basically,
> it
> >> is
> >>> a batch insertion for a bulk of data. However, with increasing demand
> of
> >>> real time data analytics with streaming frameworks, CarbonData needs a
> >> way
> >>> to insert streaming data continuously into CarbonData table. CarbonData
> >>> needs a support for continuous and faster ingestion into CarbonData
> table
> >>> and make it available for querying.
> >>>
> >>> CarbonData can leverage from our newly introduced V3 format to append
> >>> streaming data to existing carbon table.
> >>>
> >>>
> >>> Requirements:
> >>>
> >>> Following are some high level requirements;
> >>> 1.  CarbonData shall create a new segment (Streaming Segment) for each
> >>> streaming session. Concurrent streaming ingestion into same table will
> >>> create separate streaming segments.
> >>>
> >>> 2.  CarbonData shall use write optimized format (instead of
> multi-layered
> >>> indexed columnar format) to support ingestion of streaming data into a
> >>> CarbonData table.
> >>>
> >>> 3.  CarbonData shall create streaming segment folder and open a
> streaming
> >>> data file in append mode to write data. CarbonData should avoid
> creating
> >>> multiple small files by appending to an existing file.
> >>>
> >>> 4.  The data stored in new streaming segment shall be available for
> query
> >>> after it is written to the disk (hflush/hsync). In other words,
> >> CarbonData
> >>> Readers should be able to query the data in streaming segment written
> so
> >>> far.
> >>>
> >>> 5.  CarbonData should acknowledge the write operation status back to
> >> output
> >>> sink/upper layer streaming engine so that in the case of write failure,
> >>> streaming engine should restart the operation and maintain exactly once
> >>> delivery semantics.
> >>>
> >>> 6.  CarbonData Compaction process shall support compacting data from
> >>> write-optimized streaming segment to regular read optimized columnar
> >>> CarbonData format.
> >>>
> >>> 7.  CarbonData readers should maintain the read consistency by means of
> >>> using timestamp.
> >>>
> >>> 8.  Maintain durability - in case of write failure, CarbonData should
> be
> >>> able recover to latest commit status. This may require maintaining
> source
> >>> and destination offsets of last commits in a metadata.
> >>>
> >>> This feature can be done in phases;
> >>>
> >>> Phase -1 : Add basic framework and writer support to allow Spark
> >> Structured
> >>> streaming into CarbonData . This phase may or may not have append
> >> support.
> >>> Add reader support to read streaming data files.
> >>>
> >> 1. In this phase, is it still using columnar format? Save to a file for
> >> every mini batch? If so, it is only readable after the file has been
> closed
> >> and some metadata need to be kept to indicate the availability of the
> new
> >> file.
> >> 2. How to map the partition concept in spark to files in streaming
> >> segment? I guess some small file will be created, right?
> >>
> >>> Phase-2 : Add append support if not done in phase 1. Maintain append
> >>> offsets and metadata information.
> >>>
> >> Is the streaming data file format implemented in this phase?
> >>
> >>> Phase -3 : Add support for external streaming frameworks such as Kafka
> >>> streaming using spark structured steaming, maintain
> >>> topics/partitions/offsets and support fault tolerance .
> >>>
> >>> Phase-4 : Add support to other streaming frameworks , such as flink ,
> >> beam
> >>> etc.
> >>>
> >>> Phase-5: Future support for in-memory cache for buffering streaming
> data,
> >>> support for union with Spark Structured streaming to serve directly
> from
> >>> spark structured streaming.  And add support for Time series data.
> >>>
> >>> Best Regards,
> >>> Aniket
> >>>
> >>
> >>
> >>
> >>
>
>
