Hello,
>I am not getting the concept of pickle and shelves in python, I
>mean what's the use of both the concepts, when to use them in code
>instead of using file read and write operations.
>
>Could anyone please explain me the concepts.
I see you already have two answers (from David Rock and Alan
Gauld).
I will add a slightly different answer and try also to explain some
of the history (at a very high level).
* Programs need to take input from "somewhere".
* Programs need data structures in memory on which to operate.
There are many different ways to store data "somewhere" and in order
to create the data structures in memory (on which your program will
operate), you need to have code that knows how to read that data
from "somewhere".
So, there's data that has been written to disk. I often call this
the serialized form of the data [0]. There are different uses
for serialization, but the ones I'll talk about are the serialized
formats that we typically write to disk to store data for a program
to read. Here are a few such serialized formats:
* JSON or XML (homage: SGML)
* pickles and shelves
* GNU dbm, ndbm, cdb, rocksdb and probably 1 million others
* custom binary formats
* plain text files (be careful with your encoding...)
Digression: You might ask... what about SQL? Technically, the
serialization is something that the SQL database software takes care
of and your application doesn't. So no need to know about the
serialized format. This can be freeing, at the cost of some
complexity. But,
back to your question.
Every one of the serialized formats comes with some advantages and
some disadvantages. Some are easy. Some are flexible. Other
formats are structured with bindings in many languages. Some
are tied closely to a single language or even specific language
versions. Some formats are even defined by a single application or
program that somebody has written.
What about pickle and shelve? Where do they fit?
Both pickle and shelve are well-maintained and older Python-specific
formats that allow you to serialize Python objects and data
structures to disk. This is extremely convenient if you are
unlikely to change Python versions or to change your data
structures. Need your program to "remember" something from a prior
run? When it starts up, it can read a ./state.pickle straight into
memory, pick up where it left off and perform some operation, and
then, when complete, save the data structure back to ./state.pickle (or
more safely to a new file ./state.$timestamp) and exit.
This is a convenient way to store Python objects and data
structures.
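Here is a minimal sketch of that "remember something from a prior
run" pattern. The file location, the function names and the shape of
the state dictionary are all made up for illustration; a
./state.pickle in your working directory would behave the same way.

```python
import os
import pickle
import tempfile

# Hypothetical location for the saved state (illustration only).
STATE_FILE = os.path.join(tempfile.gettempdir(), "state.pickle")


def load_state(path):
    """Return the saved state, or a fresh one if nothing was saved yet."""
    try:
        with open(path, "rb") as f:   # pickle files are binary
            return pickle.load(f)
    except FileNotFoundError:
        return {"runs": 0}


def save_state(state, path):
    """Write to a temporary file first, then rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)             # avoids leaving a half-written file


state = load_state(STATE_FILE)
state["runs"] += 1
save_state(state, STATE_FILE)
print("this program has run", state["runs"], "time(s)")
```

Each time you run it, the counter picks up where the previous run
left off. Writing to a temporary name and renaming is just one way
to be careful about overwriting the old file.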
Advantages: Native Python. Dead simple to use (you still have to be
careful about file-writing logic; overwriting old files can be
bad, but it's a bit up to you). You can dump many Python data
structures and objects to disk.
Disadvantages: Files are only readable by Python (excluding
motivated implementers in other languages).
If you would like to use pickle or shelve, please ask again on this
list for specific advice on these. The shelve module is intended to
make it easy to have a data structure in memory that is backed by a
data file on disk. This is very similar to what the dbm module also
offers.
The pickle module is more geared toward loading an entire data
structure from the disk into memory.
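To make the difference concrete, here is a small sketch of shelve
used as a dictionary that lives on disk. The path and the keys are
invented for the example; only shelve.open() and the dictionary-style
access come from the standard library.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf (illustration only).
path = os.path.join(tempfile.mkdtemp(), "inventory")

# First "run": store a couple of entries, one key at a time.
with shelve.open(path) as db:
    db["apples"] = {"count": 3}
    db["pears"] = {"count": 7}

# Later "run": reopen the shelf and touch only the entry we need.
with shelve.open(path) as db:
    apples = db["apples"]     # only this value is unpickled
    apples["count"] += 1
    db["apples"] = apples     # reassign so the change is written back
    keys = sorted(db.keys())

print(keys)
```

With pickle you would load and save the whole structure in one go;
with shelve each key is read and written individually.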
There are other options that have been used for decades (see below
my sig for an incomplete and light-hearted history of serialization
in the digital world).
The options for serialization formats, and for accessing them from
Python, are many. Pickle and shelve are very Python-specific, but
will be very easy to use and will be more forgiving if you happen to
try to store some code as well as "pure" data.
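As a rough illustration of "more forgiving": the sketch below stores
a dictionary holding a set and a date. The record itself is made up.
Pickle round-trips it unchanged, while the json module refuses it
unless you convert those values yourself.

```python
import json
import pickle
from datetime import date

# Made-up record containing types that JSON has no native form for.
record = {"tags": {"home", "urgent"}, "due": date(2024, 1, 31)}

# pickle: dump to bytes and load back, types intact.
restored = pickle.loads(pickle.dumps(record))
print(type(restored["tags"]).__name__, type(restored["due"]).__name__)

# json: raises TypeError for the set and the date.
try:
    json.dumps(record)
    json_ok = True
except TypeError as exc:
    json_ok = False
    print("json refused:", exc)
```

The price of that convenience is the one already mentioned: the
result is only readable from Python.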
If you are going to need to exchange data with other programs,
consider JSON. Reading and writing JSON is as easy as reading and
writing a shelf (which uses the pickle format under the hood).
Here's a two-liner that will take the environment of a running
program and dump that into a human- and machine-readable JSON
format. Step A:
import os, sys, json
json.dump(dict(os.environ), sys.stdout, indent=4, sort_keys=True)
Now, let's say that you want to read that in another program (and
I'll demonstrate just dumping the in-memory representation to your
terminal). Step B:
import sys, json, pprint
pprint.pprint(json.load(sys.stdin))
So, going back to your original question.
>I am not getting the concept of pickle and shelves in python, I
>mean what's the use of both the concepts, when to use them in code
>instead of using file read and write operations.
You can, of course, use file read / write operations whenever you
need to load data into memory from the disk (or "somewhere") or to
write data from memory into the disk (or "somewhere").
The idea behind tools and libraries like...
pickle and shelve (which are Python-specific),
JSON (flexible and used by