MCU1 Flash Memory Analysis and Failures

Background

The MCU is the 17” display and the computer behind it on the Model S and Model X built prior to 2021. It is used for entertainment, maps, navigation, and HVAC control.  It does not control the ability to drive the car, so it can be rebooted, or can fail, while driving, and it is not a safety-critical item.  Inside the MCU is flash memory, like the flash drive you might use to store music or dashcam video. The flash memory used is called eMMC, or embedded MultiMediaCard. Flash memory has a finite lifetime, and at some point it fails. Some owners have had MCU1 fail because the flash memory failed, which can be expensive if it occurs outside of the 4-year, 50,000-mile warranty. Tesla has extended the warranty on the eMMC in MCU1 to 8 years / 100,000 miles. It also created a reimbursement program for those who had eMMC repairs done within the new warranty period.  More details are in the Warranty Adjustment Program.  Even more recently, Tesla initiated a recall to replace the eMMC in MCU1.  More Tesla details are at 8GB eMMC Recall Frequently Asked Questions (Aug-2022).

MCU1

The rear of a removed MCU1

Failures

While most cars have never had any MCU problems, some cars have had the MCU1 fail.  The most common failure is the internal flash memory, called eMMC, which is no longer able to store data reliably. When this occurs, owners may see erratic operation of the display, very slow responses, map rendering issues, spontaneous reboots, reboots that take longer than 5 minutes, or a completely black screen. Once the MCU has failed completely, your phone app and Tesla service cannot remotely access the car.

Due to the way the memory is used, a high percentage of failures occur during a software update.  The update needs to write a lot of data into an unused area of the flash memory. If that area has previously failed, the system crashes once the update is installed and the car switches over to it.  The cause is not a bug in the update, but flash memory that no longer works properly.

There are other possible causes of MCU1 failures, but they appear less frequently.  The second most common issue is that touch operations become erratic or are ignored completely.  This is usually a fault of the touch controller, and a screen replacement corrects it. This is cheaper than having the entire MCU1 replaced.

There are likely a few other rare part failures in the MCU assembly, but for the rest of this article, we focus only on eMMC failures. We expect that some reported MCU1 failures are not eMMC related, but for this analysis we assume every reported failure is due to eMMC.

Failure Operation

Tesla has made some modifications to keep some functions operational in the event of failure. With several updates, the last being 2020.48.12, the car will keep defrost running if it was on and set the cabin temperature to 72°F.  In addition, the rear-view camera will appear on the screen.  The updates also provide an early warning of eMMC problems, up to 6 months before the eMMC fails.

Production

The MCU1 design went into service in July 2012 with the first Model S.  It was used in both the Model S and X through 28-Feb-2018.  After this, Tesla switched to a new design, called MCU2. During this time, about 317,000 vehicles were made with MCU1.

During the production, there were several minor design changes and one significant change – support of LTE in mid-2015.

In most cases, MCU1 is removed, opened up, and the CPU daughterboard with the eMMC memory is replaced with a daughterboard with a new 64 GB eMMC chip.  This chip is a better design as well, considering it was designed 10 years after the original part.

In some cases, MCU1 is replaced. Replacements are done with a remanufactured unit. The defective unit is returned to the factory for repair.  This is a standard practice in the automotive world.  The service technician does not have to identify what internal part may have failed but just swaps out the unit, such as MCU1.  All current remanufactured MCU1s include a different eMMC part that has at least 8 times the longevity of the original part, as the original part is no longer made. 

A replacement MCU1 also includes the LTE feature at no extra cost (a $500 option by itself), should you have an older Model S without LTE.  Note that in the USA, 3G phone service is expected to be terminated by phone carriers by February 2022.

Failure Rate

Akikiki has done some amazing work collecting data from owners of failed MCU1s.  This is a combination of reports in the Tesla forums and the TMC forums.  As of January 2021, we have 553 reported failures worldwide. This includes Tesla replacement MCU1s, MCU2 retrofits, eMMC daughterboard replacements, and third-party replacements of the eMMC chips.

Obviously, this is not the entire universe of failures, as not everyone reports problems to the forums.  Still, this is great information that puts a dimension on the issue.  Of the failed MCUs, 2 have not been identified with a vehicle year, and the others shown below have known vehicle years.

Vehicle Year   Age (in 2020)   MCU1 S/X Sales   Failed MCU1s   % Failed per year
2012           8               2,650            17             0.080%
2013           7               22,242           100            0.064%
2014           6               31,655           71             0.037%
2015           5               51,773           219            0.085%
2016           4               75,890           98             0.032%
2017           3               103,020          44             0.014%
2018           2               29,980           4              0.007%

The average failure rate per year is 0.046%.  Because we expect plenty of people do not report failures in the forums, this understates the real rate.  Previously we multiplied this by 15, but we now think there are more failures we don’t know about and have increased the multiplier to 25.  This is a bit of a guess, but we are saying our numbers include only 1/25 of the actual failures.

0.046% * 25 = 1.15% per year estimated failure rate.
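
The arithmetic behind the table and the estimate can be reproduced with a short script. This is only a sketch: the figures are copied from the table above, the names (REPORTED, UNDERREPORTING_FACTOR, pct_failed_per_year) are ours for illustration, and the factor of 25 is still the guess described above.

```python
# Reported failures by vehicle year, copied from the table above.
# vehicle year -> (age in 2020, MCU1 S/X sales, reported failed MCU1s)
REPORTED = {
    2012: (8, 2_650, 17),
    2013: (7, 22_242, 100),
    2014: (6, 31_655, 71),
    2015: (5, 51_773, 219),
    2016: (4, 75_890, 98),
    2017: (3, 103_020, 44),
    2018: (2, 29_980, 4),
}

# Assumption from the article: forum reports capture about 1 in 25 real failures.
UNDERREPORTING_FACTOR = 25


def pct_failed_per_year(age, sales, failed):
    """Percent of that year's vehicles reported failed, spread over their age."""
    return failed / sales / age * 100


rates = {year: pct_failed_per_year(*row) for year, row in REPORTED.items()}
average_pct = sum(rates.values()) / len(rates)
estimated_pct = average_pct * UNDERREPORTING_FACTOR

for year, rate in rates.items():
    print(f"{year}: {rate:.3f}% per year")
print(f"average reported rate: {average_pct:.3f}% per year")
print(f"estimated actual rate: {estimated_pct:.2f}% per year")
```

Carrying the unrounded average through gives roughly 1.14% rather than 1.15%; the small difference comes from rounding the average to 0.046% before multiplying.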

Now we also have some data from Tesla: “Tesla said it has received 2,399 complaints and field reports, 7,777 warranty claims, and 4,746 non-warranty claims related to MCU replacements.” The 2,399 number likely overlaps many that had MCU1 fixed, but let’s lump them all together anyway for 14,922 failures.

So the total failure rate over, say, 5 years is 14,922 / 317,000 = 4.7%. Per year, that is 4.7% / 5 = 0.94%. Tesla’s numbers are very close to our analysis, which was done without any input from Tesla!
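
The same cross-check takes a few lines of code. This sketch uses only the numbers quoted above; the variable names are ours, and the 5-year span is the round figure assumed in the text, not a value from Tesla.

```python
# Figures quoted from Tesla above.
complaints_and_field_reports = 2_399
warranty_claims = 7_777
non_warranty_claims = 4_746

vehicles_with_mcu1 = 317_000  # approximate MCU1 production, from the Production section
years_in_service = 5          # round assumption used in the text

# Lumped together, knowing the complaints likely overlap with the claims.
total_failures = complaints_and_field_reports + warranty_claims + non_warranty_claims
cumulative_pct = total_failures / vehicles_with_mcu1 * 100
annual_pct = cumulative_pct / years_in_service

print(f"total failures: {total_failures:,}")
print(f"cumulative failure rate: {cumulative_pct:.1f}%")
print(f"per-year failure rate: {annual_pct:.2f}%")
```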

Let’s look at the data another way. Here we charted the failure rate for each model year, with each colored line representing a vehicle year.  The X-axis, with years prefixed with “F”, is the year in which the failures occurred.

The peak failures occur for a 2012 Model S (in light blue) in 2020, or 8 years after the vehicle was made.  We combined S and X sales in all our numbers, but the Model X sales did not start until V-2015.  We left out a few failures that did not identify the model year or when MCU1 failed.

Now we’ve used the same data but added in some major events, the introduction of AP1 in 2014, AP2 in 2016, and when Linux logging was stopped in late 2019.

It does look like AutoPilot increased the failure rate and that removing the Linux logging might have helped. Consider that the V-2012 vehicles, shown by the light blue line, do not have AP1 or AP2, yet their failures increase sharply in year 6.  For the year 2015 vehicles with AP1, the sharp increase occurs in year 5.  It’s too early to tell the effect of AP2 on 2017 vehicles.

Tesla has also made a number of optimizations to the utilization and management of the eMMC flash memory. Tesla strongly recommends you have version 2020.20 or later to get all these optimizations.  In addition, should the eMMC start to fail, versions at 2020.20 and later will default the CPU into a safe mode that provides defrost, cabin heat/cool, and rear-view display.

So, will your MCU1 fail?  That’s impossible to answer, but we can at least shed some light on some factors that affect the life of the eMMC.

eMMC Longevity and Wear

On the number of cycles before failure, there are many unknown variables, which makes it difficult to predict when a failure may occur or if there is anything you can do to extend its life.  There are many vehicles on the road that are over 7 years old and/or have over 200,000 miles and the MCU1 is still working fine.  On the flip side, there have been a few failures in cars less than 4 years old and less than 30,000 miles. We’ll look at why there can be so much variance in longevity.

The chip itself – There isn’t a hard count limit where a byte fails. It likely varies depending on micro variances in the making of the chip. You could have one chip that lasts its rated 3,000 cycles and another that lasts 10,000 cycles. The manufacturer only specifies the minimum cycle life.

Data Values – You could have a byte fail, but the data written to it happens to match the failure, so it is not yet marked as bad. For example, if a byte fails to all zeros and you write zero to that location, the stored value is still valid and no fault occurs.

Block write – Flash memory is written in blocks of bytes. In the SK Hynix eMMC used by Tesla, a block is 448 bytes. The entire block is always written if any byte within it changes. If, after a write, the block does not match what was written, the block is marked bad and the data is rewritten into another block. This happens in the background and depends on how many free blocks are available. When the available blocks are exhausted, the chip has nowhere to write the data. What happens at that point is unclear, but you are going to lose data. Later, when the CPU attempts to read code or data values from a corrupted section of memory, things can go awry.
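
To make the write, verify, and remap idea concrete, here is a toy model. It is only an illustration under loud assumptions: the class ToyFlash, its fixed per-block cycle limit, and the tiny block counts are all hypothetical, and a real eMMC controller also does wear levelling and error correction that this sketch ignores.

```python
class SparesExhausted(Exception):
    """Raised when a block goes bad and no spare block is left."""


class ToyFlash:
    """Toy write/verify/remap model; NOT how any real eMMC controller works."""

    def __init__(self, logical_blocks, spare_blocks, cycle_limit):
        total = logical_blocks + spare_blocks
        self.wear = [0] * total                      # writes seen per physical block
        self.cycle_limit = cycle_limit               # hypothetical fixed wear-out point
        self.mapping = list(range(logical_blocks))   # logical -> physical block
        self.spares = list(range(logical_blocks, total))
        self.remaps = 0

    def write(self, logical):
        while True:
            physical = self.mapping[logical]
            self.wear[physical] += 1
            if self.wear[physical] <= self.cycle_limit:
                return  # read-back matched what was written
            # Read-back mismatch: mark the block bad and move to a spare.
            if not self.spares:
                raise SparesExhausted(f"no spare left for logical block {logical}")
            self.mapping[logical] = self.spares.pop(0)
            self.remaps += 1


def writes_until_failure(flash, logical):
    """Hammer one logical block and count the writes that succeed."""
    count = 0
    try:
        while True:
            flash.write(logical)
            count += 1
    except SparesExhausted:
        return count


flash = ToyFlash(logical_blocks=4, spare_blocks=2, cycle_limit=3)
good_writes = writes_until_failure(flash, logical=0)
print(f"successful writes: {good_writes}, remaps used: {flash.remaps}")
```

In this toy, one heavily written block consumes every spare while the other blocks sit idle, which is why the number of free blocks matters so much to how long the chip keeps working.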

Temperature – There are other smaller factors, such as temperature, that are likely to affect longevity. The specs provide a minimum cycle life over the temperature range, but I’d expect longer cycle life at one end of the range or the other. So if your car is always hot or always cold, the eMMC might last longer or shorter. I have no idea which way is better or how much effect it has.

Frequency of data change – Code updates are very infrequent and would not cause cycle life failures. I estimate there have been fewer than 100 code updates since 2012.

Various settings are also stored in the eMMC memory, but those values do not change often.

There are server logs, and some thought this was the cause of early failures. A typical Linux server log is quite small and these changes, while frequent, should last far beyond the life of the car. Others disagree with me, but no one has done an extensive logic-analyzer evaluation. Near the end of 2019, Tesla disabled the internal server log to reduce writes. These logs are only useful for Tesla’s engineers during development.

There are vehicle logs, data such as charging history, and vehicle faults. I don’t know how much data is collected; I don’t expect this is a significant amount of data or that it causes a lot of writes.

There are other items that may be saved, such as streaming audio. I suspect it is saved in RAM, but if some of it is saved in the eMMC, such as the buffered song when downloaded, this is a large amount of memory, about 6 MB for a 3-minute song. The use of streaming varies a lot by the user. Some owners never use it, while others are frequent streamers. More likely the music file is only saved if the MCU is powered off. Probably this is not a concern.

For those who use a USB drive for music, the songs are indexed and the index is stored in the eMMC. Prior to Version 10 software, the indexing also stored thumbnail album cover images. This may have been a large amount of data for those with a lot of songs.  The data is re-indexed when the USB drive is disconnected and reconnected, but this occurs infrequently.

Navigation data is saved to flash, but how much is unclear. Web history is also saved. Some owners may use these features more frequently than others and may increase the number of write cycles. Most of this data seems quite small, so I would not expect it to have much effect.

Some Tesla techs have stated the trip counters consume memory and cause more eMMC use.  While I am sure it uses some memory, I doubt it uses much compared with other items.

AutoPilot records a lot of information for operations that did not exist in pre-AP cars. While most of the data is never saved, if even a small fraction is saved, it could be a huge amount with video files or images. There has been talk that AP data is saved even if you are not using AP, but perhaps more data is saved when AP is active.  This may be why the year 2015 cars seem to have failures earlier in life than the 2012 vehicles.

Memory Controller Design Problem – A few people have suggested there is a design problem in the chip’s memory controller. We cannot substantiate such a problem, but it is always possible. Perhaps the memory controller is creating more writes than are needed, causing excessive wear.  Considering several companies use these parts in volume, I would expect more news about it if it were true.

eMMC Parts

Tesla bought the processor board from Nvidia, and the Nvidia board contains the eMMC from SK Hynix. Unlike consumer flash memory, the eMMC is soldered to the PCB via a 153-ball pad array.

A fixer in Europe reported the following parts being used on the boards:

H26M42001FMR in 2012 and 2013 cars

H26M42002GMR in ~ 2014 cars

H26M42003GMR in 2015+ cars.

eMMC

These chips are all made by SK Hynix, are 8 GB parts, and are MLC (Multi-Level Cell) design.  The chips include a flash controller to make the memory accessible to software like a disk drive.  We have been unable to find any significant difference between the parts, other than small tweaks to the controller specifications.  We do not believe the different parts would have any functional or longevity differences.

The minimum write cycle life is 3,000 cycles, and data retention is rated at 10 years. The 42001 and 42002 parts adhere to the MMC 4.41 standard and the 42003 part uses the MMC 4.5 standard. In this application, the parts can be considered interchangeable.
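
The minimum cycle rating can be turned into a rough write budget. The sketch below assumes perfect wear levelling and no write amplification unless you pass a factor, both of which are idealizations; the function names and the daily write volumes are hypothetical illustrations, not measurements from any car.

```python
CAPACITY_GB = 8           # size of the original SK Hynix parts
MIN_WRITE_CYCLES = 3_000  # minimum rated write cycles


def total_write_budget_gb(capacity_gb, cycles, write_amplification=1.0):
    """Ideal total data that can be written before reaching the rated cycles."""
    return capacity_gb * cycles / write_amplification


def years_until_worn(daily_writes_gb, capacity_gb=CAPACITY_GB,
                     cycles=MIN_WRITE_CYCLES, write_amplification=1.0):
    """Years to use up the ideal write budget at a constant daily write volume."""
    budget = total_write_budget_gb(capacity_gb, cycles, write_amplification)
    return budget / (daily_writes_gb * 365)


budget_gb = total_write_budget_gb(CAPACITY_GB, MIN_WRITE_CYCLES)
print(f"ideal write budget: {budget_gb:,.0f} GB")

# Hypothetical daily write volumes, chosen only to show the scale.
for daily_gb in (1, 4, 8):
    print(f"{daily_gb} GB/day -> about {years_until_worn(daily_gb):.1f} years")
```

Under the same idealization, a 64 GB part with the same cycle rating has 8 times the write budget, which lines up with the "at least 8 times the longevity" figure quoted for the replacement parts.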

Interestingly, we found that the same part, the H26M42003GMR, is used in the Microsoft Xbox One.

All three of these parts are considered obsolete now by Hynix and are no longer manufactured.  I was able to find some small inventory of the last two versions for sale, but none of the first versions. There are now larger and better parts available that are functionally the same.

Replacement of the eMMC is quite tricky. First, before the chip is removed, you need to extract some key information from it and transfer that data to a new chip.  Next, you have to remove the eMMC without harming the PCB – a job for which few have the right equipment or skill.  Then, with the replacement part pre-loaded with the data extracted from the original chip, you need to solder it to the PCB, again with specialized equipment.


A video of the MCU1 disassembly and eMMC replacement process (courtesy of B1Zteam)

Some third-party vendors are replacing the SK Hynix eMMC with a Swissbit SFEM064GB1EA1 or similar chip formatted to 32 GB for compatibility.  This should last far longer than the original due to its size and pSLC endurance mode, perhaps as much as 28 times longer. This part retails for $67 today.

For its own replacements, Tesla uses a 64 GB Micron eMMC, and possibly a part from another vendor.

Safety Concerns

Some owners think an MCU failure is a safety concern.  While we don’t share that view, if your MCU fails and you consider it unsafe to drive, you should stop driving and have the car towed, and have the MCU1 repaired.

The car drives fine without MCU1 working, but some vehicle features may not work or may work in limited ways.  Some items like exterior turn signals should still work, but you may not get any inside audible confirmation or indication on the screen. The rear camera does not appear when you are in reverse unless you have version 2020.20 or later software.

Some owners have reported an MCU1 failure as a safety item to NHTSA due to the lack of rear camera video. Rear camera video has been required only on cars made since May 2018.  More details are available on the NHTSA investigation of the Model S.

Since a rear video camera was not required until after MCU1 was discontinued by Tesla and replaced by MCU2, it seemed unlikely that NHTSA would consider an MCU1 failure a safety concern, but it did. This is even with the change that provides a rear-view camera on-screen even if the eMMC fails.  In the past, safety concerns have focused on items that cause an immediate accident, such as a wheel breaking off, an airbag issue, or the engine catching fire.

Warranty

The MCU is covered by the vehicle’s 4-year, 50,000-mile warranty, and as of November 2020, Tesla has extended the warranty of the eMMC in MCU1 to 8 years and 100,000 miles.  Now that Tesla has implemented a recall on the eMMC, the warranty period no longer matters.

Solutions

Have Tesla fix it.  There is no cost. The replacement should last at least 8 times longer, and depending on the chip design, perhaps a lot longer.

There are alternative vendors that can take your MCU1 and replace the bad eMMC part. These were useful when Tesla was charging for eMMC repairs, but we’ve retained this information, as not all countries may get the recall.

To use these vendors, you may have to remove MCU1 from your car, a difficult process for most people. Some vendors offer a full service – removal and eMMC replacement. In addition, you may be without a drivable car until you get the replacement back. This process only works if the vendor can extract key information from the damaged eMMC and transfer the data to the new part.  It sounds like they have a high success rate, but you should contact the vendor to get a better feel for the success rate.

removed parts

Stack of removed eMMC parts before being destroyed (courtesy of CCI Tesla)

I’ve listed a number of vendors that offer Tesla eMMC data recovery and replacement services. I believe all offer mail-in services from any area. Links are for the vendor’s email or websites. Please contact those you are interested in to find out their full capabilities, pricing, and service areas.

USA Vendors

  • AppleGuru – Boston, Massachusetts area – Also offers mobile MCU removal/installation services
  • Electrified Garage – Florida area; (352) 354-9006
  • Electrified Garage – Massachusetts area; (978) 206-1811
  • EVFixme – Torrance, California area; (949) 682-8261 – Also offers MCU removal/installation services
  • EvFixmeNorCa – San Francisco, California area; (415) 287-9963
  • Gruber Motor Company – Phoenix, AZ area; (602) 863-2655 x500 – Also offers MCU removal/installation services
  • Theo – Maryland area

Outside USA Vendors

  • eMMC Repair – Denmark (serving Norway and Sweden too); +45 9696 1111
  • Laadkabel Winkel – Netherlands; 040-3041027 – Also offers MCU removal/installation services
  • TXS – Italy and travel to France/Switzerland; +39.347.4118324 – Also offers MCU removal/installation services

MCU2 Upgrade Option

Another alternative for USA owners is to buy the MCU2 upgrade from Tesla. This uses a flash memory that is 16 times larger, 128 GB.  If everything else were identical, that should mean 16 times the longevity. The MCU2 upgrade offers a lot of additional features, but the older analog AM/FM/XM radio is not compatible with MCU2.  We go into detail in our article on MCU2 Upgrade and AM/FM/XM Radio issues with alternatives and solutions.  Tesla now offers a radio retrofit for $500.

The cost of the MCU2 retrofit, including installation, is $2250 for S/X prior to HW2.0. For vehicles with HW2.x, the retrofit cost is $1750. The MCU2 upgrade also replaces the instrument cluster and includes LTE if you have an old car without LTE.  For HW2.x cars, Tesla appears to be also upgrading the AP processor to HW3.0, even if you haven’t purchased FSD yet.

Should I Proactively Replace My eMMC?

We recommend that you have Tesla deal with any failures unless you’re interested in the MCU2 upgrade. With the recall, Tesla should deal with it. The downside is it may take months for Tesla to get to your car.  Those that have automatically detected eMMC faults are going to be at the head of the queue.

The failure rate appears somewhat low. Some sound the alarm that every MCU1 will fail. That is true, but it could be 10 more years before your specific unit fails. We really don’t have enough data to state when or if your MCU1 will fail. Some have failed in as little as 4 years.  Vehicle usage does appear to be a factor. 

Wait, there’s more!

Here are links to various other sources. Some have opinions that differ from ours, and some were written in the early days before the problem was better understood.