Hi Community,

Because Carbon is closely integrated with Spark, insert operations in Carbon are performed through the Spark API. Each insert therefore launches Spark jobs, which add overheads such as task serialisation cost, extra memory consumption, execution time on remote nodes, and shuffle.
For simple insert operations, we can improve performance by reusing the SDK (which is plain Java code) to do the same work, thereby avoiding the overheads described above.

The design document is linked below. Please share your comments, inputs, and suggestions.

https://docs.google.com/document/d/1BcbTcO__vZbLLuhU73NIcbJOM2FRcKBa-ZxackofAS0/edit?usp=sharing

Thanks and regards,
N Akshay Kumar