This has been covered before on other sites (and to some extent on ours), but since I recently explained all of this to someone on Twitter I thought I’d take what I wrote then and massage it into some form of post for our blog.
To begin with we need to cover a few terms. One is that, as I mentioned in our Should I Get Lightroom or Photoshop or…? post, a digital image isn’t really an ‘image’ in the traditional sense. It starts as light, gets converted into electrical impulses by the camera’s sensor, and is then translated into binary code – 1s and 0s – as a digital file.
Be that as it may, there are two aspects to each digital image. One is the image information itself – the code that is assembled to create the image on your computer screen or as a print – and the other is information about the image. This starts in the camera and can include the camera make/model/serial number, the exposure information, date and time of the image capture, the lens, focal length (for zoom lenses), GPS information and more. When the image is transferred to the computer one can add copyright information, keywords, owner contact information and so on. All of this is collectively called ‘metadata’.
This metadata is stored in one of two ways, depending on the type of digital image you’re working with. For .TIF, .PSD, .JPG and .DNG files, the metadata is stored within the image file itself. For proprietary raw images, a second file, often called a ‘sidecar’ file, is generated. Sidecar files have an .XMP extension. So for a Canon raw file, for example, one would have IMG0001.cr2 for the image data and IMG0001.xmp for the metadata.
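If it helps to see the naming rule spelled out, here is a small Python sketch of it. This is only an illustration of the rule as described above, not anything Lightroom itself runs. The function name `metadata_location` is made up for the example, and I have assumed the `.tiff` and `.jpeg` spellings behave like `.tif` and `.jpg`.

```python
from pathlib import Path

# Formats that, per the description above, carry metadata inside the image file.
# The ".tiff" and ".jpeg" spellings are an assumption added for this sketch.
EMBEDDED_FORMATS = {".tif", ".tiff", ".psd", ".jpg", ".jpeg", ".dng"}


def metadata_location(image_path: str) -> Path:
    """Return the file that holds the metadata for the given image (hypothetical helper)."""
    path = Path(image_path)
    if path.suffix.lower() in EMBEDDED_FORMATS:
        return path                     # metadata lives inside the image file
    return path.with_suffix(".xmp")     # proprietary raw: sidecar with the same base name


print(metadata_location("IMG0001.cr2"))  # IMG0001.xmp
print(metadata_location("IMG0001.jpg"))  # IMG0001.jpg
```

The point to take away is that the sidecar shares the raw file's base name, which is why the two files need to travel together.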
The second thing to understand is that while programs like Lightroom will do raw file conversion, printing, slideshows, web galleries and more, Lightroom is primarily a database management system (DBMS). A database is an organized collection of information, and a DBMS is a program that allows someone to work with and manipulate this information. To me, this is the most important part of the Lightroom program, because it doesn’t matter whether you have 1000 images or 100,000 images, eventually you’re going to be looking for one specific image. How easy (or not) it is to find depends on how well you’ve managed your data. In general terms this is called digital asset management (DAM) (with thanks to Wikipedia!)
The Lightroom Catalog(s) that you have are the database system. There are pros and cons to using one or more than one Lightroom catalog; Julieanne Kost discusses some of them here. For most people, having all of your images in one catalog offers a lot of advantages but the choice is ultimately yours.
Good file structure is also important; how you organize your files is up to you, as long as you make it consistent. If you’re interested you can read about My Lightroom Workflow. Just to be clear, we’re now dealing with 3 separate entities – the image file, the metadata (which may be contained within the image file or in a separate ‘sidecar’ file), and the database system (the Lightroom catalog).
The first step in working with Lightroom is to ‘Import’ one’s images into the Catalog. ‘Import’ is a bit of a misnomer because Lightroom does not store your images in the catalog. What it does is create a line (record) in the catalog for each image that says, ‘This image is located at this location’. That location may be on an internal hard drive, on an external hard drive, a network drive, or even at an online location. Lightroom doesn’t really care where you store your images, although there are certain advantages to having all of the images stored within one parent folder. When you ‘import’ an image or series of images into Lightroom, the program does several things:
- It will move/copy the image(s) to the specified location, or add a link to them at their current location. In so doing it creates a pointer so that it knows where the image is located.
- It will read the metadata information associated with the image(s) and add that information to its database of information for each image.
- It will create a preview of each image, based on the specifications you set in the Preferences. NB: for raw files, Lightroom will initially display the .jpg image embedded in the raw file until it builds its own previews.
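For the database-minded, the import steps above can be pictured as something like the following Python sketch. It is a toy model only: the `Record` and `Catalog` classes, their fields and the example path are all invented for illustration and say nothing about how the real catalog is laid out internally.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    location: str    # pointer to where the image file lives
    metadata: dict   # copy of the metadata read from the file at import
    preview: str     # stand-in for the preview Lightroom builds


@dataclass
class Catalog:
    records: dict = field(default_factory=dict)

    def import_image(self, location: str, file_metadata: dict) -> Record:
        # Note what is NOT stored here: the image data itself.
        record = Record(
            location=location,
            metadata=dict(file_metadata),
            preview=f"preview of {location}",
        )
        self.records[location] = record
        return record


catalog = Catalog()
record = catalog.import_image(
    "/Photos/2015/IMG0001.cr2", {"camera": "Canon", "iso": 100}
)
print(record.location)  # /Photos/2015/IMG0001.cr2
```

The record holds a location, a copy of the metadata and a preview, which is exactly why ‘Import’ is a bit of a misnomer.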
Now that Lightroom knows where the images are located, one should always move images and folders from within Lightroom’s Library module. Doing so ensures that Lightroom will update its pointers as to where the images are located on the drive. If you move an associated image outside of Lightroom, this connection gets broken and you end up with a ? in the top right of each image in the Library module. What that means is that as far as Lightroom is concerned, the image is ‘missing’. If you end up with ‘missing’ images, you can do the following:
1) Open Lightroom and go to the Library module.
2) Click the Library menu in the menu bar and make sure ‘Show Photos in Subfolders’ is checked.
3) If the parent folder isn’t showing in the left panel of the Library module/folders panel, right-click on one of the top-most folders showing and select ‘Add Parent Folder’.
4) Once you have the topmost folder showing, click on it and all of your images should be present. Click on the ? mark of one image and select ‘Locate’. Navigate to the new location of that image and click on the image, creating a new file association. Lightroom should now update the locations of all of the missing files within that folder and subfolders.
5) Remember: Once images have been ‘imported’ (associated) with Lightroom, always move them within Lightroom.
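To see why relinking one image can fix a whole folder, here is a rough Python sketch of the idea. It assumes the catalog is nothing more than a dictionary of image ids and paths, which is a simplification, and the function names and folder paths are hypothetical.

```python
from pathlib import PurePosixPath


def find_missing(pointers: dict, files_on_disk: set) -> list:
    """Return the ids of images whose stored location no longer exists (the ? images)."""
    return [image_id for image_id, path in pointers.items()
            if path not in files_on_disk]


def relink_folder(pointers: dict, old_folder: str, new_folder: str) -> dict:
    """Re-point every image under old_folder, including subfolders, to new_folder."""
    old, new = PurePosixPath(old_folder), PurePosixPath(new_folder)
    updated = {}
    for image_id, path in pointers.items():
        p = PurePosixPath(path)
        if old in p.parents:
            p = new / p.relative_to(old)   # keep the subfolder structure intact
        updated[image_id] = str(p)
    return updated


pointers = {
    1: "/Photos/2015/IMG0001.cr2",
    2: "/Photos/2015/trip/IMG0002.cr2",
    3: "/Other/IMG0003.cr2",
}
# The 2015 folder was moved to /Archive outside of the catalog's knowledge.
on_disk = {
    "/Archive/2015/IMG0001.cr2",
    "/Archive/2015/trip/IMG0002.cr2",
    "/Other/IMG0003.cr2",
}

print(find_missing(pointers, on_disk))   # [1, 2]
relinked = relink_folder(pointers, "/Photos/2015", "/Archive/2015")
print(find_missing(relinked, on_disk))   # []
```

Only the pointers change. The edits and keywords attached to each record stay put, which is why relinking brings everything back.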
The next thing to understand is that in Photoshop, for example, when you edit an image you’re actually editing the image part of the file – you’re changing pixels – and that’s why it’s important to work with layers to avoid damaging the base layer. Lightroom, on the other hand, uses a completely non-destructive workflow called ‘parametric editing’. Rather than altering the pixel information of the image, it writes a history of the steps taken in the Develop, Print or other modules. When you ‘Export’ an image from Lightroom, it takes the image information from the original file, modifies it according to the history steps and creates a new image file according to the parameters you set.
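Parametric editing is easier to grasp with a tiny example. The Python sketch below uses three numbers as a stand-in for pixels and two made-up adjustments. The arithmetic is purely illustrative and is not how Lightroom calculates exposure or contrast.

```python
original = [100, 128, 150]   # stand-in for the pixel data in the original file

# The "edit" is only a list of steps and amounts; no pixels are touched yet.
history = [
    ("exposure", 10),
    ("contrast", 1.5),
]


def export(pixels, steps):
    """Apply the recorded steps to a copy of the pixels and return a new image."""
    result = list(pixels)                 # work on a copy, never the original
    for name, amount in steps:
        if name == "exposure":
            result = [p + amount for p in result]
        elif name == "contrast":
            result = [128 + (p - 128) * amount for p in result]
    # Clamp to the 0-255 range of an 8-bit image.
    return [max(0, min(255, round(p))) for p in result]


exported = export(original, history)
print(exported)   # [101, 143, 176]
print(original)   # [100, 128, 150]  (unchanged)
```

Delete a step from the history and export again, and you get a different result from the same untouched original. That is the whole trick.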
Now: by default any changes made in Lightroom to either the metadata (adding keywords, copyright information, etc.) or the image (the history steps) are stored only in the Lightroom catalog. That’s why it’s important not to move images outside of Lightroom; it loses track of what information pertains to which image. That’s what the ? means. You can also instruct Lightroom to output that metadata information (including copyright info, keywords, etc. as well as the history steps) to the image metadata (the sidecar file in the case of raw files). You can either set Lightroom to do this automatically or you can do it by selecting one or more images and pressing Cmd/Ctrl-S. To have Lightroom do this automatically for you, go to Edit/Catalog Settings (Win) or Lightroom/Catalog Settings (Mac). On the Metadata tab there’s an option called ‘Automatically write changes into XMP’. Check that box. The advantage of doing this automatically is that you have a continual backup of this information. The disadvantage is that it takes computer resources to do this and you may suffer a slowdown in performance. At the risk of being redundant, if you don’t save out this information, then all of the metadata information is stored only within the LR catalog and not with the image itself. This is fine as long as you use Lightroom exclusively as a file manager and you back up regularly.
When you import an image into Lightroom it reads the metadata stored with the image file. However, if you didn’t save the metadata out to the files and you re-import the image as a ‘new’ file, Lightroom won’t know that any changes have been made to it and will treat it as an unedited file.
Let’s construct a hypothetical scenario. Take Image 1, import it into Lightroom, and upon import Lightroom will read the metadata and gain whatever information is in there about the camera, lens, exposure, etc. Make some changes to the image in Lightroom, which are stored in the catalog (database). Now, move that file outside of Lightroom and you get a ? beside the image. When you relink the image at its new location back to the catalog Lightroom lines up all of the changes made in Lightroom with that image at its new location.
Take Image 2, import it into Lightroom and Lightroom will read the metadata, etc. Make some changes in Lightroom and once you’ve finished in the Develop module, write those changes to the image metadata. Now the information is stored in both the Lightroom catalog and the image metadata. Now, remove the image from the Lightroom catalog (but don’t delete it from the hard drive). Move that file to a new location outside of Lightroom. Import that file into Lightroom again and when Lightroom reads the metadata it will say, “Oh, I’ve already created a history file for this image” and show the image with the changes made in Lightroom. NB: Some things like Pick Flags, Collections and Virtual Copies are stored only in the Lightroom catalog no matter what.
Take Image 3, import it into Lightroom… Make some changes to the image, which Lightroom stores in its catalog. Do NOT write the changes to the image metadata. Remove the file from the Lightroom catalog (but don’t delete it from the hard drive). Move the image outside of Lightroom, or leave it where it is; it doesn’t matter, because when you remove the image from the catalog Lightroom deletes the association it had with that file. Re-import that image into Lightroom. Because all of the Develop settings, history, etc. were stored only in the catalog and not written to the image, Lightroom will read the camera, lens, exposure information, etc. from the image metadata, but all of the keywords, edit changes, etc. made in Lightroom will be gone and it will treat the image as a new file.
See the difference? If you write the metadata added by Lightroom to the image file before removing a file or folder from the catalog, then when you re-import them Lightroom will read the edit information from the image metadata. If instead the edits, keywords, etc. were stored only in the Lightroom catalog when the images were removed, you’ll still be able to re-import the original files, but all of the changes you made in Lightroom will be gone.
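Here are Image 2 and Image 3 played out as a toy Python model, in case seeing it run makes the difference clearer. Everything in it is invented for the illustration: the `Catalog` class, its method names and the use of a plain dictionary to stand in for the metadata stored with the file.

```python
class Catalog:
    """Toy catalog: keeps edits per image name (hypothetical model)."""

    def __init__(self):
        self.edits = {}

    def import_image(self, name, file_metadata):
        # On import, read whatever edits are already stored with the file.
        self.edits[name] = list(file_metadata.get("edits", []))

    def edit(self, name, step):
        self.edits[name].append(step)            # by default, catalog only

    def save_metadata_to_file(self, name, file_metadata):
        file_metadata["edits"] = list(self.edits[name])   # the Cmd/Ctrl-S step

    def remove(self, name):
        del self.edits[name]                     # the file on disk is untouched


catalog = Catalog()

# Image 2: edits are written out to the file before removal.
image2 = {"camera": "Canon"}
catalog.import_image("Image 2", image2)
catalog.edit("Image 2", "exposure +0.5")
catalog.save_metadata_to_file("Image 2", image2)
catalog.remove("Image 2")
catalog.import_image("Image 2", image2)
print(catalog.edits["Image 2"])   # ['exposure +0.5']

# Image 3: edits live only in the catalog.
image3 = {"camera": "Canon"}
catalog.import_image("Image 3", image3)
catalog.edit("Image 3", "exposure +0.5")
catalog.remove("Image 3")
catalog.import_image("Image 3", image3)
print(catalog.edits["Image 3"])   # []
```

In both cases the camera information survives because it was in the file all along. Only the work done in Lightroom depends on whether it was written out.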
In that event you may be able to get the information back, depending on whether or not you backed up the Lightroom catalog. If you did back up the catalog before you removed the images, go to the folder where your Lightroom catalog is stored; there will be a Backups folder. Within that folder will be a number of folders, each with a backup date. Pick the most recent folder from before the images were removed from Lightroom; within that folder will be a Lightroom catalog. Hold down the Alt/Opt key and click on the catalog to open Lightroom with that catalog. The metadata/history information should be there, although if the files/folders have been moved the images may have question marks. Relink them if necessary. Now, go to File/Export as Catalog and export the catalog somewhere like the Desktop. Go to File/Open Catalog, open your default Lightroom catalog, then go to File/Import from Catalog and select the catalog you just saved to your Desktop. If you cross your fingers and hold your tongue just right, it will add the files/folders that are not currently in the Lightroom catalog along with the Develop changes, etc.
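If you have a long list of backups, picking the right one is a simple date comparison. The Python sketch below assumes the backup folders are named by date and time in the form `YYYY-MM-DD HHMM`; check your own Backups folder, because that naming is an assumption here. The function name and the dates are made up.

```python
from datetime import datetime


def latest_backup_before(folder_names, removed_at):
    """Return the newest backup folder dated before removed_at, or None."""
    dated = []
    for name in folder_names:
        try:
            # Assumed folder naming: "YYYY-MM-DD HHMM"
            dated.append((datetime.strptime(name, "%Y-%m-%d %H%M"), name))
        except ValueError:
            continue   # skip anything that is not a dated backup folder
    earlier = [item for item in dated if item[0] < removed_at]
    return max(earlier)[1] if earlier else None


folders = ["2015-03-01 0900", "2015-03-08 0900", "2015-03-15 0900", "notes"]
images_removed = datetime(2015, 3, 10)

print(latest_backup_before(folders, images_removed))   # 2015-03-08 0900
```

The backup from the 15th is newer, but it was made after the images were removed, so it would not contain the edits you are trying to recover.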
Update: This is also worth reading: What Is Not Included In Lightroom XMP Files
I trust that makes some kind of sense! If you have any questions, please feel free to leave a comment below. Now go out and make some pictures!
P.S. II, the sequel. Digital Asset Management and metadata can get a lot more in depth than is presented here. If you’re a photo archivist or someone working in the field with many, many images and need more information, these links might help: