Well, updates (in memory) would ultimately be flushed to disk, resulting in new hfiles.
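As a rough illustration of that flush (names are made up; this is not HBase's actual MemStore API): writes mutate an in-memory map, and once the memstore grows past a threshold it is written out as a new immutable, key-sorted file:

```python
# Hypothetical sketch of an LSM-style memstore flush (not HBase's real classes).
class MemStore:
    def __init__(self, flush_threshold=3):
        self.data = {}                  # in-memory, mutable key -> value map
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.data[key] = value          # update happens purely in memory

    def should_flush(self):
        return len(self.data) >= self.flush_threshold

    def flush(self):
        # Emit an immutable, key-sorted file (an "hfile"); memstore is emptied.
        hfile = sorted(self.data.items())
        self.data = {}
        return hfile

store = MemStore()
hfiles = []                             # on-disk store files, newest last
for k, v in [("r3", "c"), ("r1", "a"), ("r2", "b")]:
    store.put(k, v)
    if store.should_flush():
        hfiles.append(store.flush())

print(hfiles)  # [[('r1', 'a'), ('r2', 'b'), ('r3', 'c')]]
```

Note the flushed file is sorted even though the puts arrived out of order; that sortedness is what later makes merging files cheap.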
On Fri, Oct 21, 2016 at 1:50 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> thanks
>
> bq. all updates are done in memory o disk access
>
> I meant data updates are operated on in memory, no disk access.
>
> in other words, much like an rdbms: read data into memory and update it
> there (assuming that data is not already in memory?)
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn *https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw*
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 21 October 2016 at 21:46, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > bq. this search is carried out through map-reduce on region servers?
> >
> > No map-reduce. The region server uses its own thread(s).
> >
> > bq. all updates are done in memory o disk access
> >
> > Can you clarify? There seem to be some missing letters.
> >
> > On Fri, Oct 21, 2016 at 1:43 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> >
> > > thanks
> > >
> > > having read the docs, it appears to me that the main reason for hbase
> > > being faster is:
> > >
> > > 1. it behaves like an rdbms such as oracle etc.: reads are looked for
> > > in the buffer cache for consistent reads, and if not found there, the
> > > store files on disk are searched. Does this mean that this search is
> > > carried out through map-reduce on region servers?
> > > 2. when the data is written, it is written to the log file sequentially
> > > first, then to the in-memory store, sorted like the b-tree of an rdbms,
> > > and then flushed to disk.
> > > this is exactly what a checkpoint in an rdbms does
> > > 3. one can point out that hbase is faster because a log-structured
> > > merge-tree (LSM-tree) has less depth than a B-tree in an rdbms.
> > > 4. all updates are done in memory o disk access
> > > 5. in summary, LSM-trees reduce disk access when data is read from
> > > disk because of reduced seek time; again, there is less depth to
> > > traverse to get data with an LSM-tree
> > >
> > > appreciate any comments
> > >
> > > cheers
> > >
> > > Dr Mich Talebzadeh
> > >
> > > On 21 October 2016 at 17:51, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > See some prior blog:
> > > >
> > > > http://www.cyanny.com/2014/03/13/hbase-architecture-analysis-part1-logical-architecture/
> > > >
> > > > w.r.t. compaction in Hive, it is used to compact deltas into a base
> > > > file (in the context of transactions). Likely they're different.
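Points 2 and 4 of the list above (a sequential log write first, then an in-memory sorted store, with reads checking memory before disk) can be sketched in a few lines. This is a generic LSM illustration with made-up names, not HBase internals:

```python
# Illustrative LSM write path: append to a sequential log (WAL) before
# applying the edit to the in-memory store.
wal = []        # write-ahead log: sequential appends only
memstore = {}   # in-memory store; sorted on flush

def put(key, value):
    wal.append((key, value))   # 1. durable sequential log write
    memstore[key] = value      # 2. in-memory update, no random disk I/O

def get(key, store_files):
    # Reads check memory first, then on-disk store files, newest first.
    if key in memstore:
        return memstore[key]
    for sf in reversed(store_files):
        if key in sf:
            return sf[key]
    return None

put("row1", "v1")
put("row1", "v2")              # the update itself touches only memory
print(get("row1", []))         # v2
print(wal)                     # [('row1', 'v1'), ('row1', 'v2')]
```

The WAL append is sequential, which is the cheap kind of disk write; the random-access part of the update stays in memory until a flush, which is the behaviour Ted confirms at the top of the thread.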
> > > > Cheers
> > > >
> > > > On Fri, Oct 21, 2016 at 9:08 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Can someone explain, in a nutshell, the Hbase use of the
> > > > > log-structured merge-tree (LSM-tree) as its data storage
> > > > > architecture?
> > > > >
> > > > > The idea of merging smaller files into larger files periodically to
> > > > > reduce disk seeks: is this a similar concept to compaction in HDFS
> > > > > or Hive?
> > > > >
> > > > > Thanks
> > > > >
> > > > > Dr Mich Talebzadeh
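The periodic merge asked about here can be sketched as folding several small sorted files into one larger sorted file, where the newest value for a key wins. This is a toy model of the general LSM idea, not HBase's actual compaction code:

```python
# Toy LSM compaction: merge small sorted (key, value) files into one.
def compact(store_files):
    """Later files in the list are newer; the newest value per key wins."""
    merged = {}
    for sf in store_files:          # iterate oldest first, newer overwrites
        for key, value in sf:
            merged[key] = value
    return sorted(merged.items())   # one larger, still-sorted file

old = [("r1", "a"), ("r3", "c")]
new = [("r1", "a2"), ("r2", "b")]
print(compact([old, new]))  # [('r1', 'a2'), ('r2', 'b'), ('r3', 'c')]
```

After compaction a read only has to consult one file instead of several, which is where the reduction in disk seeks comes from.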
> > > > >
> > > > > On 21 October 2016 at 15:27, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> > > > >
> > > > > > Sorry, that should read Hive, not Spark, here:
> > > > > >
> > > > > > Say compared to Spark that is basically a SQL layer relying on
> > > > > > different engines (mr, Tez, Spark) to execute the code
> > > > > >
> > > > > > Dr Mich Talebzadeh
> > > > > >
> > > > > > On 21 October 2016 at 13:17, Ted Yu <yuzhih...@gmail.com> wrote:
> > > > > >
> > > > > >> Mich:
> > > > > >> Here is a brief description of the hbase architecture:
> > > > > >> https://hbase.apache.org/book.html#arch.overview
> > > > > >>
> > > > > >> You can also get more details from Lars George's or Nick
> > > > > >> Dimiduk's books.
> > > > > >>
> > > > > >> HBase doesn't support SQL directly. There is no cost-based
> > > > > >> optimization.
> > > > > >>
> > > > > >> Cheers
> > > > > >>
> > > > > >> > On Oct 21, 2016, at 1:43 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> > > > > >> >
> > > > > >> > Hi,
> > > > > >> >
> > > > > >> > This is a general question.
> > > > > >> >
> > > > > >> > Is Hbase fast because Hbase uses hash tables and provides
> > > > > >> > random access, and stores the data in indexed HDFS files for
> > > > > >> > faster lookups?
> > > > > >> >
> > > > > >> > Say compared to Spark, which is basically a SQL layer relying
> > > > > >> > on different engines (mr, Tez, Spark) to execute the code
> > > > > >> > (although it has a Cost Based Optimizer), how does Hbase fare,
> > > > > >> > beyond relying on these engines?
> > > > > >> >
> > > > > >> > Thanks
> > > > > >> >
> > > > > >> > Dr Mich Talebzadeh
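On the hash-table premise in this opening question: HBase actually keeps rows sorted by row key, and each region serves a contiguous key range, so a client can locate the right region with a binary search over region start keys rather than a hash lookup. A minimal illustration (not HBase client code; the key ranges are invented):

```python
import bisect

# Regions hold sorted, contiguous row-key ranges; find a row's region
# by binary search over the sorted region start keys (illustrative only).
region_start_keys = ["", "g", "n", "t"]   # 4 regions covering the keyspace

def find_region(row_key):
    # Rightmost region whose start key is <= row_key.
    return bisect.bisect_right(region_start_keys, row_key) - 1

print(find_region("apple"))   # 0  (falls in ["", "g"))
print(find_region("melon"))   # 1  (falls in ["g", "n"))
print(find_region("zebra"))   # 3  (falls in ["t", end))
```

Sorted keys also make range scans cheap, which a hash table could not offer.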