See this earlier blog post: http://www.cyanny.com/2014/03/13/hbase-architecture-analysis-part1-logical-architecture/
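For the LSM-tree part of the question, here is a rough sketch of the general idea only, not HBase's actual code. The class name `TinyLSM`, its methods, and both thresholds are made up for illustration. Writes go to an in-memory buffer, the buffer is flushed as an immutable sorted run, and once enough runs pile up they are merged into one so that a read has fewer files to check. In HBase terms the buffer loosely corresponds to the MemStore and the runs to HFiles.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: an in-memory buffer plus immutable sorted runs.

    Hypothetical illustration only; none of these names are HBase APIs.
    """

    def __init__(self, flush_threshold=2, compact_threshold=3):
        self.memstore = {}  # newest writes, held in memory
        self.runs = []      # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold
        self.compact_threshold = compact_threshold

    def put(self, key, value):
        # Writes only touch memory; no existing run is modified in place.
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the buffer as one sorted, immutable run.
        if not self.memstore:
            return
        self.runs.append(sorted(self.memstore.items()))
        self.memstore = {}
        if len(self.runs) >= self.compact_threshold:
            self.compact()

    def compact(self):
        # Merge all runs into one; iterating oldest first lets newer
        # values overwrite older ones for the same key.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())]

    def get(self, key):
        # Check the buffer first, then the runs from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Without compaction, a lookup for an old key may have to search every run; after compaction there is a single run to search. That is the "fewer disk seeks" effect the question refers to.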
W.r.t. compaction in Hive, it is used to compact delta files into a base file (in the context of transactions). Likely the two are different.

Cheers

On Fri, Oct 21, 2016 at 9:08 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> Can someone explain in a nutshell HBase's use of the log-structured
> merge-tree (LSM-tree) as its data storage architecture?
>
> The idea of periodically merging smaller files into larger files to reduce
> disk seeks: is this a similar concept to compaction in HDFS or Hive?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 21 October 2016 at 15:27, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> > Sorry, that should read Hive, not Spark, here:
> >
> > Say compared to Spark that is basically a SQL layer relying on different
> > engines (mr, Tez, Spark) to execute the code
> >
> > On 21 October 2016 at 13:17, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> >> Mich:
> >> Here is a brief description of the HBase architecture:
> >> https://hbase.apache.org/book.html#arch.overview
> >>
> >> You can also get more details from Lars George's or Nick Dimiduk's
> >> books.
> >>
> >> HBase doesn't support SQL directly. There is no cost-based optimization.
> >>
> >> Cheers
> >>
> >> > On Oct 21, 2016, at 1:43 AM, Mich Talebzadeh
> >> > <mich.talebza...@gmail.com> wrote:
> >> >
> >> > Hi,
> >> >
> >> > This is a general question.
> >> >
> >> > Is HBase fast because it uses hash tables and provides random access,
> >> > and because it stores the data in indexed HDFS files for faster
> >> > lookups?
> >> >
> >> > Say, compared to Spark, which is basically a SQL layer relying on
> >> > different engines (mr, Tez, Spark) to execute the code (although it
> >> > has a Cost Based Optimizer), how does HBase fare, beyond relying on
> >> > these engines?
> >> >
> >> > Thanks