I use https://github.com/dmkoch/django-jsonfield for storing metadata and
works very well, the only problem that I see in using one jsonfield for all
the metadata you have for a document is searching (which I don't need).

3k, 6k or 9k of rows is "nothing" for MySQL or Postgres, a bottleneck won't
be there. The problem may be implementing searching, you can keep the data
in memory and implement the searching in Python in the case of "regular db
search" is not enough.

HTH

On Sun, Dec 17, 2017 at 2:28 AM, Dan Tagg <dant...@wildmanherring.com>
wrote:

> If you switched to Postgres you could use a json data type or dictionary
> data type. Django 1.11 onwards supports the storage and retrieval of both
> types I believe.
>
> Dan
>
> On Sun, 17 Dec 2017 at 07:46, Mark Phillips <m...@phillipsmarketing.biz>
> wrote:
>
>> I am building an electronic document management system using django and
>> mysql. The purpose of this system is to catalog and retrieve electronic
>> documents. There is no inventory of the physical documents. There are about
>> 3,000 documents (photos, pdfs, videos) now, and that number could double in
>> the next year. The purpose of the site is to search for a set of documents
>> that share some given set of metadata values. I am wrestling with the model
>> design for the metadata part of this project.
>>
>> I have identified about 30 internal metadata types particular to these
>> documents, and not related to any standards. If I use Dublin Core, there
>> are 71 types, but I would not need all of them, and there is some overlap
>> with the 30 internal metadata types - say another 25 types. Schema.org also
>> has an interesting set of metadata tags, and some may be applicable to this
>> project, and using that standard as well, could bring the number of types
>> to 75.
>>
>> As I see it, I have three options.
>>
>> Option 1 - traditional database design, with metadata types as columns,
>> and values in rows. The document model would have approx 50 - 80 fields,
>> and one could expect out of 3,000 documents about 50-70% of the values are
>> Null.
>>
>> This option is the most straight forward to create. I have no idea what
>> the performance would be for django and mysql, searching for a set of
>> documents in a model with 50-80 fields and 3,000 entries, where 60% are
>> Null. Is this a feasible design, or do I need to think of another way to
>> model this metadata? Will this scale to 6,000 documents? What if I need to
>> add another metadata standard to the site in the future? (I am not looking
>> for job security in this project)
>>
>> Option 2 - some sort of EAV design, with a table that has document_id,
>> metadatatype_id, and metadatavalue_id. I have read all sorts of posts about
>> the evils of EAV, but they seem to have peaked about 8 years ago, and now I
>> am seeing some posts that even recommend this design approach.
>>
>> Is this approach a better fit for my particular application? Will it
>> scale to the numbers of documents and metadata I am thinking about? I
>> really don't want to go to the dark side if I can help it.
>>
>> Option 3 - A hybrid solution that I have also read about. Put the more
>> common metadata types as fields in the model (columns in the database
>> table), and then JSON-ize the rest of the metadata for that document in a
>> string field in the model. This seems to put a rather large burden on the
>> application code when searching for a set of documents with a particular
>> set of metadata values.
>>
>> The performance cost of this method seems to depend heavily on the
>> distribution of metadata types in the entire document set, and being smart
>> enough to pick the best common ones. If there were a set of metadata types
>> that span 90% of the the documents, then this would probably be a good
>> solution. Span only 25% of the documents, and there is a lot of comparing
>> values in the JSON string.
>>
>> Thanks for any insights/recommendations you can provide!
>>
>> Mark
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>>
>> You received this message because you are subscribed to the Google Groups
>> "Django users" group.
>>
>>
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to django-users+unsubscr...@googlegroups.com.
>>
>>
>> To post to this group, send email to django-users@googlegroups.com.
>>
>>
>> Visit this group at https://groups.google.com/group/django-users.
>>
>>
>> To view this discussion on the web visit https://groups.google.com/d/
>> msgid/django-users/CAEqej2MYmfqaA1Gq7nByjGo%3DcxF6_x7fbz4cuRy2EgrCBmuDVA%
>> 40mail.gmail.com
>> <https://groups.google.com/d/msgid/django-users/CAEqej2MYmfqaA1Gq7nByjGo%3DcxF6_x7fbz4cuRy2EgrCBmuDVA%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>>
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-users+unsubscr...@googlegroups.com.
> To post to this group, send email to django-users@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/django-users/CAPZHCY7H1Go%2BOVzYfdToF_JrV3Wo783S3c%3Dk_
> eWztj6F0twQng%40mail.gmail.com
> <https://groups.google.com/d/msgid/django-users/CAPZHCY7H1Go%2BOVzYfdToF_JrV3Wo783S3c%3Dk_eWztj6F0twQng%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/CA%2BFDnhJE2d4D93h7SHzt%3DmeeTdi5OB%2B9eKUvk4APXacYJ%3DmADA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to