Thanks Wilm,
Let me try to explain my scenario in more detail. Let me talk about two
specific entities, Jobs and Sources.
*Source* - A URL that is the source of some data. It also carries other
meta-info such as description and type. So the required columns are
source_name, url, description, type.
*Job* - An independent entity created with data from the selected sources.
Apart from the job information, we need to keep track of which sources were
selected for this job. This list is editable, so sources can be added or
removed. The columns needed in a job row are job_name, description,
source_{source-rowkey} and so on.
I was considering the following options:
1. Create a JSON of each source and dump it into the value field of a
source_{timestamp} column in the job row. But I need to be able to list all
of the available sources before creating a job. This would mean scanning all
jobs and finding just the unique sources from all the lists. This seems like
overkill.
Another problem with this approach is that I would have to write my own
custom filters if I need to filter jobs on the basis of source.
2. Create a new table for sources and keep the rowkeys of the sources in the
job rows. This turns out to be somewhat like foreign keys though, which
understandably is awkward for HBase. But now I have the option of scanning
the sources table for listing purposes.
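
To make the listing cost of the two options concrete, here is a rough sketch
in plain Python, with dicts standing in for HBase tables. No HBase client is
involved, and all row keys, sample values and function names are made up for
illustration.

```python
import json

# Option 1: each job row embeds the full source JSON under
# source_{timestamp} columns (hypothetical sample data).
jobs_embedded = {
    "job1": {
        "job_name": "crawl-news",
        "source_1001": json.dumps({"source_name": "a", "url": "http://a.example"}),
        "source_1002": json.dumps({"source_name": "b", "url": "http://b.example"}),
    },
    "job2": {
        "job_name": "crawl-blogs",
        "source_1003": json.dumps({"source_name": "b", "url": "http://b.example"}),
    },
}


def list_sources_option1(jobs):
    """Scan every job row and de-duplicate the embedded sources by URL."""
    unique = {}
    for row in jobs.values():
        for column, value in row.items():
            if column.startswith("source_"):
                source = json.loads(value)
                unique[source["url"]] = source
    return sorted(unique)


# Option 2: sources live in their own table; job rows only hold
# source_{source-rowkey} columns pointing at them.
sources = {
    "src-a": {"source_name": "a", "url": "http://a.example"},
    "src-b": {"source_name": "b", "url": "http://b.example"},
}
jobs_by_ref = {
    "job1": {"job_name": "crawl-news", "source_src-a": "", "source_src-b": ""},
    "job2": {"job_name": "crawl-blogs", "source_src-b": ""},
}


def list_sources_option2(source_table):
    """Scan only the (much smaller) sources table."""
    return sorted(row["url"] for row in source_table.values())
```

Both functions return the same list here, but option 1 has to touch every job
row to build it, and a source that no job references yet would not show up at
all.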
And this is where my question originated. When I need to fetch the sources
for a particular job, I could filter the source table on a job key column.
This would mean a long scan over all rows of the source table.
Another option is to fetch the list of source rowkeys from the job row and
then directly hit the source table for these specific rowkeys.
If option 2 holds up, which of these two fetch methods is more prudent?
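
The two fetch methods can be compared with a similar sketch, again using
plain Python dicts as stand-ins for the HBase tables. The job_{job-rowkey}
columns on the source rows, the sample data and the function names are
assumptions for illustration only.

```python
# Hypothetical source table. The job_{job-rowkey} columns are only needed
# by the scan-based method.
source_table = {
    "src-a": {"url": "http://a.example", "job_job1": ""},
    "src-b": {"url": "http://b.example", "job_job1": "", "job_job2": ""},
    "src-c": {"url": "http://c.example"},
    "src-d": {"url": "http://d.example"},
}

# Hypothetical job table with source_{source-rowkey} columns.
job_table = {
    "job1": {"job_name": "crawl-news", "source_src-a": "", "source_src-b": ""},
    "job2": {"job_name": "crawl-blogs", "source_src-b": ""},
}


def sources_by_scan(job_key):
    """Method A: scan all source rows, keep those with a job_{job_key} column.

    Returns (matching source row keys, number of rows examined).
    """
    wanted = "job_" + job_key
    hits = [key for key, row in source_table.items() if wanted in row]
    return sorted(hits), len(source_table)


def sources_by_lookup(job_key):
    """Method B: read the job row, then fetch each referenced source by key.

    Returns (matching source row keys, number of rows read).
    """
    prefix = "source_"
    keys = [col[len(prefix):] for col in job_table[job_key] if col.startswith(prefix)]
    rows = {key: source_table[key] for key in keys}
    return sorted(rows), 1 + len(keys)
```

In this toy data, sources_by_scan("job2") examines all 4 source rows, while
sources_by_lookup("job2") reads 2 rows (the job row plus one source). The
lookup cost grows with the number of sources in the job rather than with the
size of the source table, and it does not require writing job keys back into
the source rows.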
This example might not look like it involves huge data, but I do expect
millions of jobs to be created. Also, this is a common pattern which I need
to implement in other HBase tables too.
Thanks,
Jatin
--
View this message in context:
http://apache-hbase.679495.n3.nabble.com/HBase-entity-relationship-tp4066296p4066327.html
Sent from the HBase User mailing list archive at Nabble.com.