Introduction to versioned\_collection
=====================================
.. currentmodule:: versioned_collection
``versioned_collection`` is Python library that allows tracking and versioning
MongoDB collections. The data required for versioning is stored in the same
database as the collection to be versioned, this approach having the
advantage of keeping all the data needed for versioning in a single place,
allowing for instance for easier and more intuitive backups or migrations of
versioned collections.
There are two main ways of interacting with a versioned collection:
* using the Python API
* starting to listen to the collection via the command line client and
then updating the documents in any other way, e.g., via
`mongosh `_,
`Compass `_, etc.
What is a VersionedCollection
-----------------------------
A :class:`VersionedCollection` extends the pymongo
`Collection `_
class by adding support for versioning in a way similar to ``git``. A
:class:`VersionedCollection` can be used in the same way a pymongo collection
is used without any overhead in the speed of the execution of the
MongoDB operations. However, this introduces one of the tradeoffs of this
library, making some of the versioning operations more expensive to run
in terms of execution time.
Basic operations and concepts
------------------------------
Knowing the basics of ``git``, learning to use ``versioned_collection`` becomes
trivial since the operations allowed on a versioned collection are a subset of
the operations allowed in ``git``, most of them having a similar semantics.
Here is a table of the ``versioned_collection`` operations and concepts
and their ``git`` correspondent:
.. list-table:: versioned_collection and git operations correspondence
:widths: 10 15 50
:header-rows: 1
* - versioned_collection
- git
- remarks
* - ``register``
- ``commit``
- Registering a `version` of a collection is equivalent to committing
the changes
* - ``checkout``
- ``checkout``
-
* - ``create_branch``
- ``branch``
- Create a new branch. Branches in ``versioned_collection`` are just
pointers to a registered version, as branches in ``git`` are just
pointers to commits.
* - ``stash``
- ``stash``
- Stashes the changes.
* - ``stash apply``
- ``stash apply``
- Applies the stashed changes. The ``versioned_collection`` differs from
the ``git`` one, and overwrites the new state of the collection with the
stashed changes (does not perform a merge).
* - ``stash discard``
- ``stash drop``
- Clears the stashed changes.
* - ``delete_version_subtree``
- ``reset --hard ``
- Removes a version and all the subsequent registered versions.
* - ``discard_changes``
- ``git reset --hard && git clean -fxd``
- Removes all the unregistered changes.
* - ``diff``
- ``diff``
- Computes the `diffs` between the currently checkout version and
another version.
* - ``log``
- ``log``
- Inspect the version log similarly as the commit log can be viewed.
* - ``pull``
- ``pull``
- Pulls the changes from a remote collection to the local collection.
* - ``push``
- ``push``
- Pushes the changes from the local collection to a remote collection.
.. warning::
The syntax of the commands and the available options differs from git,
but the meaning of the concepts is similar.
.. note::
Versioned collection can be seen as ``git`` repositories, but the notion
of remote and local collections (local and remote repositories) is weaker.
For a list of all allowed operations check the
:ref:`Python API documentation ` and the
:ref:`command line client examples `.