On Mar 10, 2007, at 3:27 PM, Michael Busch wrote:

- Introduce index-level metadata. Preferably in XML format, so it will be human readable. Later on, we can store information about the index format in this file, like the codecs that are used to store the data.

To provoke thought about what index-level metadata might go in this file, here are the contents of a KS (KinoSearch) "segments_2.yaml" file, captured immediately after indexing an HTML presentation of the US Constitution.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


slothbear:~/projects/ks/perl marvin$ cat uscon_invindex/segments_2.yaml
ks_version: 0.20_02
fields:
  title: 'KinoSearch::Schema::FieldSpec'
  url: 'USConSchema::UnIndexedField'
  content: 'KinoSearch::Schema::FieldSpec'
format: 1
generation: 2
seg_counter: 1
segments:
  _1:
    term_list_index:
      skip_interval: 16
      format: 1
      index_interval: 128
      size: 8
      counts:
        title: 1
        content: 8
    posting_list:
      format: 1
    compound_file:
      format: 1
      sub_files:
        _1.tlx2:
          offset: 138575
          length: 93
        _1.p0:
          offset: 138134
          length: 441
        _1.tvx:
          offset: 137718
          length: 416
        _1.tv:
          offset: 73487
          length: 64231
        _1.tl0:
          offset: 73259
          length: 228
        _1.p2:
          offset: 56393
          length: 16866
        _1.ds:
          offset: 7015
          length: 49378
        _1.tl2:
          offset: 421
          length: 6594
        _1.dsx:
          offset: 5
          length: 416
        _1.tlx0:
          offset: 0
          length: 5
    term_vectors:
      format: 1
    term_list:
      skip_interval: 16
      format: 1
      index_interval: 128
      size: 923
      counts:
        title: 41
        content: 923
    doc_storage:
      format: 1
    seg_info:
      seg_name: _1
      doc_count: 52
      field_names:
        - title
        - url
        - content
version: 1173732193033
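
One thing the dump makes easy to check is the compound_file layout: each sub_files entry is an (offset, length) pair, and in this segment the entries tile the file back to back with no gaps. What follows is only a rough Python sketch of how a reader might validate that and pull one sub-file out by name. It is not KinoSearch code; the names SUB_FILES, compound_file_size and read_sub_file are made up for illustration, and the numbers are copied by hand from the dump above.

```python
import io

# Hypothetical sketch, not KinoSearch code. The (offset, length) pairs are
# copied from the sub_files entries of the segments_2.yaml dump above.
SUB_FILES = {
    "_1.tlx2": (138575, 93),
    "_1.p0": (138134, 441),
    "_1.tvx": (137718, 416),
    "_1.tv": (73487, 64231),
    "_1.tl0": (73259, 228),
    "_1.p2": (56393, 16866),
    "_1.ds": (7015, 49378),
    "_1.tl2": (421, 6594),
    "_1.dsx": (5, 416),
    "_1.tlx0": (0, 5),
}


def compound_file_size(sub_files):
    """Return the total byte length, raising ValueError on a gap or overlap."""
    expected = 0
    # Walk the entries in offset order; each must start where the last ended.
    for name, (offset, length) in sorted(sub_files.items(), key=lambda kv: kv[1][0]):
        if offset != expected:
            raise ValueError(f"{name}: expected offset {expected}, found {offset}")
        expected = offset + length
    return expected


def read_sub_file(stream, sub_files, name):
    """Read one virtual file out of a seekable binary stream."""
    offset, length = sub_files[name]
    stream.seek(offset)
    data = stream.read(length)
    if len(data) != length:
        raise ValueError(f"{name}: wanted {length} bytes, got {len(data)}")
    return data


if __name__ == "__main__":
    print(compound_file_size(SUB_FILES))  # 138668
    # Stand-in for the real compound file: zero bytes of the right size.
    fake = io.BytesIO(bytes(compound_file_size(SUB_FILES)))
    print(len(read_sub_file(fake, SUB_FILES, "_1.dsx")))  # 416
```

Keeping the offsets in the metadata file rather than in a header inside the compound file means a reader can locate any sub-file with a single seek, at the cost of having to keep the two files in sync.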
