SNIA Compute, Memory and Storage Blog

30 Speakers Highlight AI, Memory, Sustainability, and More at the May 21-22 Summit!

May 1, 2024 SNIA CMSI

SNIA Compute, Memory, and Storage Summit is where solutions, architectures, and community come together. Our 2024 Summit – taking place virtually on May 21-22, 2024 – is the best example to date, featuring a stellar lineup of 30 speakers in sessions on artificial intelligence, the future of memory, sustainability, critical storage security issues, the latest on CXL®, UCIe™, and Ultra Ethernet, and more.

“We’re excited to welcome executives, architects, developers, implementers, and users to our 12th annual Summit,” said David McIntyre, Compute, Memory, and Storage Summit Chair and member of the SNIA Board of Directors. “Our event features technology leaders from companies like Dell, IBM, Intel, Meta, Samsung – and many more – to bring us the latest developments in AI, compute, memory, storage, and security in our free online event. We hope you will attend live to ask questions of our experts as they present and watch those you miss on-demand.”

Artificial intelligence sessions sponsored by the SNIA Data, Networking & Storage Forum feature J Michel Metz of the Ultra Ethernet Consortium (UEC) on powering AI’s future with the UEC, John Cardente of Dell on storage requirements for AI, Jeff White of Dell on edgenuity, and Garima Desai of Samsung on creating a sustainable semiconductor industry for the AI era. Other AI sessions include Manoj Wadekar of Meta on the evolution of hyperscale data centers from CPU centric to GPU accelerated AI, Paul McLeod of Supermicro on storage architecture optimized for AI, and Prasad Venkatachar of Pliops on generative AI data architecture.

Memory sessions begin with Jim Handy and Tom Coughlin on how memories are driving big architectural changes. Ahmed Medhioub of Astera Labs will discuss breaking through the memory wall with CXL, and Sudhir Balasubramanian and Arvind Jagannath of VMware will share their memory vision for real world applications.

Compute sessions include Andy Walls of IBM on computational storage and real time ransomware detection, JB Baker of ScaleFlux on computational storage real world deployments, Dominic Manno of Los Alamos National Labs on streamlining scientific workflows in computational storage, and Bill Martin and Jason Molgaard of the SNIA Computational Storage Technical Work Group on computational storage standards.

CXL will be featured with a CXL Consortium panel on increasing AI and HPC application performance with CXL fabrics, a presentation from Larrie Carr of Rambus on proprietary interconnects and CXL, and a session from Samsung and Broadcom on bringing unique customer value with CXL accelerator-based memory solutions.

Richelle Ahlvers and Brian Rea of UCIe (Universal Chiplet Interconnect Express) will discuss enabling an open chiplet ecosystem with UCIe.

The Summit will also dive into security with a number of presentations on this important topic.

And there is much more, including a memory Birds-of-a-Feather session, a live Memory Workshop and Hackathon featuring CXL exercises, and opportunities to chat with our experts! Check out the agenda and register for free!

Power Efficiency Measurement – Our Experts Make It Clear – Part 4

April 29, 2024 Eden Kim, Keith Orsak and Chuck Paradon

Measuring power efficiency in datacenter storage is a complex endeavor. A number of factors play a role in assessing individual storage devices or system-level logical storage for power efficiency. Luckily, our SNIA experts make the measuring easier!

In this SNIA Experts on Data blog series, our experts in the SNIA Solid State Storage Technical Work Group and the SNIA Green Storage Initiative explore factors to consider in power efficiency measurement, including the nature of application workloads, IO streams, and access patterns; the choice of storage products (SSDs, HDDs, cloud storage, and more); the impact of hardware and software components (host bus adapters, drivers, OS layers); and access to read and write caches, CPU and GPU usage, and DRAM utilization.

Join us on our final installment on the journey to better power efficiency – Part 4: Impact of Storage Architectures on Power Efficiency Measurement.

And if you missed our earlier segments, click on the titles to read them: Part 1: Key Issues in Power Efficiency Measurement, Part 2: Impact of Workloads on Power Efficiency Measurement, and Part 3: Traditional Differences in Power Consumption: Hard Disk Drives vs Solid State Drives. Bookmark this blog series and explore the topic further in the SNIA Green Storage Knowledge Center.

Impact of Storage Architectures on Power Efficiency Measurement

Ultimately, the interplay between hardware and software storage architectures can have a substantial impact on power consumption. Optimizing these architectures based on workload characteristics and performance requirements can lead to better power efficiency and overall system performance.

Different hardware and software storage architectures can lead to varying levels of power efficiency. Here’s how they impact power consumption.

Hardware Storage Architectures

HDDs v SSDs:
Solid State Drives (SSDs) are generally more power-efficient than Hard Disk Drives (HDDs) due to their lack of moving parts and faster access times. SSDs consume less power during both idle and active states.
NVMe® v SATA SSDs:
NVMe (Non-Volatile Memory Express) SSDs often have better power efficiency compared to SATA SSDs. NVMe’s direct connection to the PCIe bus allows for faster data transfers, reducing the time components need to be active and consuming power. NVMe SSDs are also performance optimized for different power states.
Tiered Storage:
Systems that incorporate tiered storage with a combination of SSDs and HDDs optimize power consumption by placing frequently accessed data on SSDs for quicker retrieval and minimizing the power-hungry spinning of HDDs.
RAID Configurations:
Redundant Array of Independent Disks (RAID) setups can affect power efficiency. RAID levels like 0 (striping) and 1 (mirroring) may have different power profiles due to how data is distributed and mirrored across drives.
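The point that faster transfers reduce the time components spend active can be illustrated with a back-of-the-envelope energy calculation (energy = power × time). This is a minimal sketch only: `transfer_energy_joules` is a hypothetical helper, and the throughput and power figures are made-up placeholders, not measurements of any real drive or interface.

```python
def transfer_energy_joules(size_mb: float, throughput_mb_s: float, active_power_w: float) -> float:
    """Energy used to move size_mb of data: E = P * t, with t = size / throughput."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds_active = size_mb / throughput_mb_s
    return active_power_w * seconds_active


# Placeholder figures chosen only to show the arithmetic (assumptions, not data).
slow_device = transfer_energy_joules(size_mb=10_000, throughput_mb_s=500, active_power_w=4.0)
fast_device = transfer_energy_joules(size_mb=10_000, throughput_mb_s=2_500, active_power_w=8.0)

print(f"slower, lower-power device: {slow_device:.1f} J")
print(f"faster, higher-power device: {fast_device:.1f} J")
```

With these placeholder inputs, the faster device draws more power while active but finishes sooner, so it uses less total energy for the same transfer. Real results depend on the workload and on idle power, which this sketch ignores.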

Software Storage Architectures

Compression and Deduplication:
Storage systems using compression and deduplication techniques can affect power consumption. Compressing data before storage can reduce the amount of data that needs to be read and written, potentially saving power.
Caching:
Caching mechanisms store frequently accessed data in faster storage layers, such as SSDs. This reduces the need to access power-hungry HDDs or higher-latency storage devices, contributing to better power efficiency.
Data Tiering:
Similar to caching, data tiering involves moving data between different storage tiers based on access patterns. Hot data (frequently accessed) is placed on more power-efficient storage layers.
Virtualization:
Virtualized environments can lead to resource contention and inefficiencies that impact power consumption. Proper resource allocation and management are crucial to optimizing power efficiency.
Load Balancing:
In storage clusters, load balancing ensures even distribution of data and workloads. Efficient load balancing prevents overutilization of certain components, helping to distribute power consumption evenly.
Thin Provisioning:
Allocating storage on-demand rather than pre-allocating can lead to more efficient use of storage resources, which indirectly affects power efficiency.
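The caching and data tiering ideas above both come down to placing frequently accessed ("hot") data on the faster, more power-efficient layer. A minimal sketch of such a placement policy might look like the following. The function name `assign_tiers`, the threshold, and the sample access log are all hypothetical; production tiering engines use much richer heuristics (recency, size, migration cost).

```python
from collections import Counter


def assign_tiers(access_log, hot_threshold=3):
    """Map each block ID to 'ssd' (hot) or 'hdd' (cold) based on access count."""
    counts = Counter(access_log)
    return {
        block: ("ssd" if hits >= hot_threshold else "hdd")
        for block, hits in counts.items()
    }


# Hypothetical access log: each entry is the ID of a block that was read.
access_log = ["a", "b", "a", "c", "a", "b", "d"]
placement = assign_tiers(access_log)
print(placement)
```

Here only block `a` reaches the threshold, so it alone is placed on the SSD tier and the rest stay on HDD.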

Just What is an IOTTA? Inquiring Minds Learn Now!

April 9, 2024 SNIA CMSI

SNIA’s twelve Technical Work Groups collaborate to develop and promote vendor-neutral architectures, standards, and education for management, movement, and security for technologies related to handling and optimizing data. One of the more unique work groups is the SNIA Input/Output Traces, Tools, and Analysis Technical Work Group (IOTTA TWG).

SNIA Compute, Memory, and Storage Initiative recently sat down with IOTTA TWG Chairs Geoff Kuenning of Harvey Mudd College and Tom West of hyperI/O LLC to learn about some exciting new developments in their work activities and how SNIA members and colleagues can get involved.

Q: What does the IOTTA TWG do?

A: The IOTTA TWG is for those interested in the use of empirical data/metrics to better understand the actual operation and performance characteristics of storage I/O, especially as they pertain to application workloads. We summarize our work in this SNIA video: https://www.youtube.com/watch?v=4EVW5IHHhEk

One of our most important activities is to sponsor a collaborative worldwide repository for storage-related I/O trace collection and analysis tools, application workloads, I/O traces, and best practices around such topics.

Q: What are the goals of the IOTTA Repository collaboration?

A: The primary goal of the IOTTA Repository collaboration is to create a worldwide repository for storage related I/O trace files, associated tools, and other related information, all of which are made available free of charge to the storage research and development communities in both academia and industry.

Repository data is often cited in research publications, with 627 citations to date listed on the IOTTA Repository website.

Q: Why is keeping and sharing information by way of a Repository important?

A: The IOTTA Repository provides a common facility through which a broad community (including storage vendors, storage users, and the academic community) can avail themselves of a variety of storage related I/O traces (especially contemporary I/O traces). We like to think of it as a “One-Stop-Shop”.

Q: What kind of information are you gathering for the Repository? Is some information more important than other(s)?

A: The Repository contains a wide variety of storage related I/O trace types, including Block I/O, HPC Summaries, Key-Value Traces, NFS Traces, Parallel Traces, Static Snapshots, System Call Traces, and Workload Summaries.

Reliability Traces are the latest category of traces added to the IOTTA Repository. Generally, the Reliability Traces category includes records of storage system reliability, for example, long-term records of hard-drive failures.

The IOTTA Repository additionally provides an off-site link to traces that cannot be included directly within the repository (e.g., unable to obtain permission to host a particular trace within the repository).

Q: Who downloads this information? What groups can make use of this information?

A: Academic institutions are among the most frequent downloaders of Repository information, along with storage companies.

Practitioners can make use of various IOTTA Repository traces to gain a better understanding of actual I/O storage operation activity within various environments and scenarios. Traces can also be used as a basis for benchmarking and testing proposed solutions.
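As an illustration of the kind of analysis a block I/O trace supports, the sketch below computes a simple read/write summary. The CSV layout (`timestamp_s,op,offset_bytes,size_bytes`) and the `summarize` function are assumptions made for this example; they are not the format of any particular trace in the IOTTA Repository, so real traces would need their own parsers.

```python
import csv
import io

# Hypothetical trace in an assumed CSV layout (not an actual Repository format).
SAMPLE_TRACE = """timestamp_s,op,offset_bytes,size_bytes
0.000,R,0,4096
0.001,W,8192,4096
0.002,R,4096,65536
0.003,R,1048576,4096
"""


def summarize(trace_file):
    """Count read and write requests and the bytes they moved."""
    reads = writes = 0
    bytes_read = bytes_written = 0
    for row in csv.DictReader(trace_file):
        size = int(row["size_bytes"])
        if row["op"] == "R":
            reads += 1
            bytes_read += size
        elif row["op"] == "W":
            writes += 1
            bytes_written += size
    total = reads + writes
    return {
        "requests": total,
        "read_fraction": reads / total if total else 0.0,
        "bytes_read": bytes_read,
        "bytes_written": bytes_written,
    }


summary = summarize(io.StringIO(SAMPLE_TRACE))
print(summary)
```

The same loop could be extended to collect request-size histograms or sequential-versus-random ratios, which are common inputs to benchmarking and replay.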

SNIA IOTTA TWG members receive a monthly report that shows the number and types (i.e., trace names) of the traces downloaded during the month, including the downloader region (e.g., Asia, Europe, North America). The report also includes company/institution names associated with the downloaders. More information on joining the IOTTA TWG is at http://iotta.snia.org/faqs/joinIOTTA.

Q: What is some of the latest information in the Repository?

A: In February 2024, we posted NVMe drive reliability traces collected by Alibaba. The collection includes both fail-stop and fail-slow data for a large drive population in Alibaba’s servers.

Q: What is the importance of these traces?

A: The authors of the associated USENIX ATC 2022 paper indicate that the Alibaba Fail-Stop dataset is the first large-scale public dataset on real-world operational data of NVMe SSD. From their analysis of the dataset, they identified a series of major reliability changes in NVMe SSD.

In addition, the authors of the associated USENIX FAST 2023 paper indicate that the Alibaba Fail-Slow dataset is the first large-scale, clear-labeled public dataset on real-world operational traces aiming at fail-slow detection (i.e., where the drive continues to run but with poor performance). Based upon the dataset, the authors have provided a root cause analysis on fail-slow drives.

With the growing importance of NVMe SSDs in the data center, it is critical to understand the reliability of hardware in the cloud. The Repository provides the traces download and also links to the papers and presentation videos that discuss these large-scale SSD reliability studies.

Q: What new activity would you like to see in the Repository?

A: We’d like to see more trace downloads for analysis. Most downloads today are related to benchmarking and replay. Trace activity could feed into a simulated computer system to test activities like failures.

We would also like to see more input of data related to tape storage. The Repository does not have much information on cold storage and multilevel storage between hot and cold storage.

Finally, we would like feedback on how people are using what they download – for analysis, reliability, benchmarks and other areas they have found the downloads useful. We also want to know what else you would like to be able to download. You can contact us directly at iottachairs@snia.org.

Thanks for your time and the great information about the IOTTA Repository. Learn more about the IOTTA Repository on their FAQ page.

2024 Year of the Summit Kicks Off – Meet us at MemCon

March 6, 2024 (updated March 13, 2024) SNIA CMSI

2023 was a great year for SNIA CMSI to meet with IT professionals and end users in “Summits” to discuss technologies, innovations, challenges, and solutions. Our outreach at six industry events reached over 16,000 and we thank all who engaged with our CMSI members.

We are excited to continue a second “Year of the Summit” with a variety of opportunities to network and converse with you. Our first networking event will take place March 26-27, 2024 at MemCon in Mountain View, CA.

MemCon 2024 focuses on systems design for the data centric era, working with data-intensive workloads, integrating emerging technologies, and overcoming data movement and management challenges. The agenda includes presentations and panels, featuring speakers from Meta, Microsoft, Netflix, Samsung, and Warner Brothers. It’s the perfect event to discuss the integration of SNIA’s focus on developing global standards and delivering education on all technologies related to data. SNIA and MemCon have prepared a video highlighting several of the key topics to be discussed.

At MemCon, SNIA CMSI member and SDXI Technical Work Group Chair Shyam Iyer of Dell will moderate a panel discussion on “How are Memory Innovations Impacting the Total Cost of Ownership in Scaling-Up and Power Consumption,” discussing impacts on hyperscalers, AI/ML compute, and cost/power.

SNIA Board member David McIntyre will participate in a panel on “How are Increased Adoption of CXL, HBM, and Memory Protocol Expected to Change the Way Memory and Storage is Used and Assembled?”, with insights on the markets and emerging memory innovations. The full MemCon agenda is here.

In the exhibit area, SNIA leaders will be on hand to demonstrate updates to the SNIA Persistent Memory Programming Workshop featuring new CXL® memory modules (get an early look at our Programming exercises here) and to provide a first look at a Smart Data Accelerator Interface (SDXI) specification implementation. We’ll also provide updates on SNIA technical work on form factors like those used for CXL. We will feature a drawing for gift cards at the SNIA hosted coffee receptions and at the Tuesday evening networking reception.

SNIA colleagues and friends can register for MemCon with a 15% discount using code SNIA15.

And stay tuned for opportunities to engage with SNIA at upcoming events in 2024, including a return of the SNIA Compute, Memory, and Storage Summit in May; FMS: the Future of Memory and Storage in August; SNIA SDC in September; and SC24 in Atlanta in November. We’ll discuss each of these in depth in our Year of the Summit blog series.

Power Efficiency Measurement – Our Experts Make It Clear – Part 3

March 4, 2024 Eden Kim, Keith Orsak and Chuck Paradon

In this SNIA Experts on Data blog series, our experts in the SNIA Solid State Storage Technical Work Group and the SNIA Green Storage Initiative explore factors to consider in power efficiency measurement, including the nature of application workloads, IO streams, and access patterns; the choice of storage products (SSDs, HDDs, cloud storage, and more); the impact of hardware and software components (host bus adapters, drivers, OS layers); and access to read and write caches, CPU and GPU usage, and DRAM utilization.

Join us on our journey to better power efficiency as we continue with Part 3: Traditional Differences in Power Consumption: Hard Disk Drives vs Solid State Drives. And if you missed our earlier segments, click on the titles to read them: Part 1: Key Issues in Power Efficiency Measurement, and Part 2: Impact of Workloads on Power Efficiency Measurement. Bookmark this blog and check back in April for the final installment of our four-part series. And explore the topic further in the SNIA Green Storage Knowledge Center.

Traditional Differences in Power Consumption: Hard Disk Drives vs Solid State Drives

There are significant differences in power efficiency between Hard Disk Drives (HDDs) and Solid State Drives (SSDs). While some commentators have examined differences in power efficiency measurement for HDDs v SSDs, much of the analysis has not accounted for the key power efficiency contributing factors outlined in this blog.

As a simple generalization at the individual storage device level, HDDs show higher power consumption than SSDs. In addition, SSDs have higher performance (IOPS and MB/s) often by an order of magnitude or more. Hence, cursory consideration of device power efficiency measurement, expressed as IOPS/W or MB/s/W, will typically favor the faster SSD with lower device power consumption.

On the other hand, depending on the workload and IO transfer size, HDD devices and systems may exhibit better IOPS/W and MB/s/W when measured with large-block sequential read/write workloads, where head actuators can remain at the disk OD (outer diameter) with limited seek activity.
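The two metrics named above are simple ratios of delivered performance to average power. A minimal sketch of the arithmetic follows. The `Measurement` class, the device labels, and all of the numbers are illustrative placeholders (assumptions for this example), not results from any SNIA test or real product.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    name: str
    iops: float             # I/O operations per second during the test
    throughput_mb_s: float  # megabytes per second during the test
    avg_power_w: float      # average power draw over the same interval

    def iops_per_watt(self) -> float:
        return self.iops / self.avg_power_w

    def mb_s_per_watt(self) -> float:
        return self.throughput_mb_s / self.avg_power_w


# Placeholder figures for two hypothetical devices under a small-block random workload.
samples = [
    Measurement("device_a", iops=200.0, throughput_mb_s=0.8, avg_power_w=8.0),
    Measurement("device_b", iops=100_000.0, throughput_mb_s=400.0, avg_power_w=5.0),
]

for m in samples:
    print(f"{m.name}: {m.iops_per_watt():.1f} IOPS/W, {m.mb_s_per_watt():.2f} MB/s/W")
```

Because both metrics depend on the workload used for the measurement, the same pair of devices can rank differently under a large-block sequential test, which is why the workload has to be reported alongside the figures.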

The above traditional HDD and SSD power efficiency considerations can be described at the device level as involving the following key points:

HDDs (Hard Disk Drives):

Mechanical Components: HDDs consist of spinning disks and mechanical read/write heads. These moving parts consume a substantial amount of power, especially during startup and when seeking data.
Idle Power Consumption: Even when not actively reading or writing data, HDDs still consume a notable amount of power to keep the disks spinning and ready to access data.
Access Time Impact: The mechanical nature of HDDs leads to longer access times compared to SSDs. This means the drive remains active for longer periods during data access, contributing to higher power consumption.

SSDs (Solid State Drives):

No Moving Parts: SSDs are entirely electronic and have no moving parts. As a result, they consume less power during both idle and active states compared to HDDs.
Faster Access Times: SSDs have much faster access times since there are no mechanical delays. This results in quicker data retrieval and reduced active time, contributing to lower power consumption.
Energy Efficiency: SSDs are generally more energy-efficient, as they consume less power during read and write operations. This is especially noticeable in laptops and portable devices, where battery life is critical.
Less Heat Generation: Due to their lack of moving parts, SSDs generate less heat during operation, which can lead to better thermal efficiency in systems.

In summary, SSDs tend to be more power-efficient than HDDs due to their lack of mechanical components, faster access times, and lower energy consumption during both active and idle states. This power efficiency advantage is one of the reasons why SSDs have become increasingly popular in various computing devices, from laptops to data centers.

Emerging Memories Branch Out – a Q&A

February 19, 2024 SNIA CMSI

Our recent SNIA Persistent Memory SIG webinar explored in depth the latest developments and futures of emerging memories – now found in multiple applications both as stand-alone chips and embedded into systems on chips. We got some great questions from our live audience, and our experts Arthur Sainio, Tom Coughlin, and Jim Handy have taken the time to answer them in depth in this blog. And if you missed the original live talk, watch the video and download the PDF here.

Q: Do you expect Persistent Memory to eventually gain the speeds that exist today with DRAM?

A: It appears that that has already happened with the hafnium ferroelectrics that SK Hynix and Micron have shown. Ferroelectric memory is a very fast technology and with very fast write cycles there should be every reason for it to go that way. With the hooks that are in CXL™, though, that shouldn’t be that much of a problem since it’s a transactional protocol. The reads, then, will probably rival DRAM speeds for MRAM and for resistive RAM (MRAM might get up to DRAM speeds with its writes too). In fact, there are technologies like spin-orbit torque and even voltage-controlled magnetic anisotropy that promise higher performance and also low write latency for MRAM technologies. I think that probably most applications are read intensive and so the read is the real place where the focus is, but it does look like we are going to get there.

Q: Are all the new Memory technology protocols (electrically) compatible to DRAM interfaces like DDR4 or DDR5? If not, then shouldn’t those technologies have lower chances of adoption as they add dependency on custom in-memory controller?

A: That’s just a logic problem. There’s nothing innate about any memory technology that couples it tightly with any kind of a bus, and so because NOR Flash and SRAM are the easy targets so far, most emerging technologies have used a NOR Flash or SRAM type interface. However, in the future they could use DDR. There are some special twists because you don’t have to refresh emerging memory technologies, but in general they could use DDR.

But one of the beauties of CXL is that you can put anything you want to, with any kind of interface, on the other side of CXL, and CXL erases the differences. It moderates them, so although they may have different performance, it’s hidden behind the CXL network. Then the burden goes on to the CXL controller designers to make sure that those emerging technologies, whether it’s MRAM or others, can be adopted behind that CXL protocol. My expectation is for there to be a few companies early on who provide CXL controllers that do have some kind of a specialty interface on them, whether it’s for MRAM or for Resistive RAM or something like that, and then eventually for them to move their way into the mainstream. Another interesting thing about CXL is that we may even see a hierarchy of different memories within CXL itself, which could also include domain specific processors or accelerators that operate close to memory, and so there are very interesting opportunities there as well. If you can do processing close to memory, you lower the amount of data you’re moving around and you save a lot of power for the computing system.

Q: Emerging memory technologies have a byte-level direct access programming model, which is in contrast to block-based NAND Flash. Do you think this new programming model will eventually replace NAND Flash as it reduces the overhead and reduces the power of transferring Data?

A: It’s a question of cost and that’s something that was discussed very much in our webinar. If you haven’t got a cost that’s comparable to NAND Flash, then you can’t really displace it. But as far as the interface is concerned, the NAND interface is incredibly clumsy. Not only do all of these technologies have byte interfaces rather than a block interface, they can also write in place – they don’t need to have a pre-erased block to write into. That from a technical standpoint is a huge advantage and now it’s just a question of whether or not they can get the cost down – which means getting the volume up.

Q: Can you discuss the High Bandwidth Memory (HBM) trends? What about memories used with Graphic Processing Units (GPUs)?

A: That topic isn’t the subject of this webinar as this webinar is about emerging memory technologies. But, to comment, we don’t expect to see emerging memory technologies adopt an HBM interface anytime in the really near future because HBM does springboard off DRAM and, as we discussed on one of the slides, DRAM will at some point transition to an emerging memory technology, but we don’t know when that is going to happen. We’ve put it into the early 2030s in our chart, but it could be much later than that and HBM won’t convert over to an emerging memory technology until long after that.

However, HBM involves stacking of chips and that ultimately could happen. It’s a more expensive process right now – a way of getting a lot of memory very close to a processor – and if you look at some of the NVIDIA applications for example, this is an example of the Chiplet technology and HBM can play a role in those Chiplet technologies for GPUs. That’s another area that’s going to be using emerging memories as well – in the Chiplets. While we didn’t talk about that so much in this webinar, it is another place for emerging memories to be playing a role.

There’s one other advantage to using an emerging memory that we did not talk about: emerging memories don’t need refresh. As a matter of fact, none of the emerging memory technologies need refresh. More power is consumed by DRAM refreshing than by actual data accesses. And so, if you can cut that out of it, you might be able to stack more chips on top of each other and get even more performance, but we still wouldn’t see that as a reason for DRAM to be displaced early on in HBM and then later on in the mainstream DRAM market. Although, if you’re doing all those refreshes there’s a fair amount of potential of heat generation by doing that, which may have packaging implications as well. So, there may be some niche areas in there which could be some of the first ways in which some of these emerging memories are potentially used for those kinds of applications, if the performance is good enough.

Q: Why have some memory companies failed? Apart from the cost/speed considerations you mention, what are the other minimum envelope features that a new emerging memory should have? Is capacity (I heard 32Gbit multiple times) one of those criteria?

A: Shipping a product is probably the single most important activity for success. Companies don’t have to make a discrete or standalone SRAM or emerging memory chip but what they need to do is have their technology be adopted by somebody who is shipping something if they’re not going to ship it themselves. That’s what we see in the embedded market as a good path for emerging memory IP: To get used and to build up volume. And as the volume and comfort with manufacturing those memories increase, it opens up the possibility down the road of lower costs with higher volume standalone memory as well.

Q: What are the trends in DRAM interfaces? Would you discuss CXL’s role in enabling composable systems with DRAM pooling?

A: CXL, especially CXL 3.0, has particularly pointed at pooling. Pooling is going to be an extremely important development in memory with CXL, and it’s one of the reasons why CXL probably will proliferate. It allows you to be able to allocate memory which is not attached to particular server CPUs and therefore to make more efficient and effective use of those memories. We mentioned this earlier when we said that right now DRAM is that memory with some NAND Flash products out there too. But this could expand into other memory technologies behind CXL within the CXL pool as well as accelerators (domain specific processors) that do some operations closer to where the memory lives. So, we think there’s a lot of possibilities in that pooling for the development and growth of emerging memories as well as conventional memories.
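The efficiency argument for pooling can be sketched with a toy capacity model: if each server is provisioned for its own peak demand, total capacity is the sum of the peaks, whereas a shared pool only needs to cover the peak of the combined demand. The function names and the demand figures below are hypothetical, and the model deliberately ignores local memory, latency, and every protocol detail of CXL.

```python
def dedicated_capacity(demand_by_server):
    """Capacity needed when each server is sized for its own peak demand."""
    return sum(max(series) for series in demand_by_server.values())


def pooled_capacity(demand_by_server):
    """Capacity needed when one shared pool covers the peak of combined demand."""
    combined = [sum(step) for step in zip(*demand_by_server.values())]
    return max(combined)


# Hypothetical memory demand (GB) per server at three points in time.
demand = {
    "server_a": [64, 256, 64],
    "server_b": [256, 64, 64],
    "server_c": [64, 64, 256],
}

dedicated = dedicated_capacity(demand)
pooled = pooled_capacity(demand)
print(f"dedicated: {dedicated} GB, pooled: {pooled} GB")
```

In this contrived case the peaks never coincide, so the pool needs half the capacity. When peaks do coincide, the two numbers converge and pooling saves nothing.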

Q: Do you think molecular-based technologies (DNA or others) can emerge in the coming years as an alternative to some of the semiconductor-based memories?

A: DNA and other memory technologies are in a relatively early stage, but there are people who are making fairly aggressive plans on what they can do with those technologies. We think the initial market for those molecular memories is not in this high performance memory application; but especially with DNA, the potential density of storage and the fact that you can make lots of copies of content by using genetic processes make them potentially very attractive for archiving applications. The things we’ve seen are mostly in those areas because of the performance characteristics. But the potential density that they’re looking at is actually aimed at that lower part of the market, so it has to be very, very cost effective to be able to do that, but the possibilities are there. But again, as with the emerging high performance memories, you still have the economies of scale you have to deal with – if you can’t scale it fast enough the cost won’t go down enough for it to actually be able to compete in those areas. So it faces somewhat similar challenges, though in a different part of the market.

Earlier in the webcast, we said when showing the orb chart, that for something to fit into the computing storage hierarchy it has to be cheaper than the next faster technology and faster than the next cheaper technology. DNA is not a very fast technology and so that automatically says it has to be really cheap for it to catch on and that puts it in a very different realm than the emerging memories that we’re talking about here. On the other hand, you never know what someone’s going to discover, but right now the industry doesn’t know how to make fast molecular memories.
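The rule of thumb quoted above can be written as a two-part check. This is only a sketch: the `fits_hierarchy` function, the tier names, and the latency and cost figures are hypothetical placeholders, not data from the webcast or its chart.

```python
def fits_hierarchy(candidate, next_faster, next_cheaper):
    """Return True if the candidate is cheaper than the next faster tier
    and faster than the next cheaper tier."""
    cheaper_than_faster = candidate["cost_per_gb"] < next_faster["cost_per_gb"]
    faster_than_cheaper = candidate["latency_ns"] < next_cheaper["latency_ns"]
    return cheaper_than_faster and faster_than_cheaper


# Placeholder tiers with made-up numbers, used only to exercise the rule.
fast_tier = {"latency_ns": 100, "cost_per_gb": 4.00}
slow_tier = {"latency_ns": 100_000, "cost_per_gb": 0.10}

good_candidate = {"latency_ns": 1_000, "cost_per_gb": 1.00}
pricey_candidate = {"latency_ns": 1_000, "cost_per_gb": 5.00}

print(fits_hierarchy(good_candidate, fast_tier, slow_tier))    # candidate sits between the tiers
print(fits_hierarchy(pricey_candidate, fast_tier, slow_tier))  # costs more than the faster tier
```

A slow technology such as the molecular memories discussed here would be compared against the slowest existing tier, which is why its cost has to be very low for the check to pass.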

Q: What is your intuition on how tomorrow’s highly dense memories might impact non-load/store processing elements such as AI accelerators? As model sizes continue to grow and energy density becomes more of an issue, it would seem like emerging memories could thrive in this type of environment. Your thoughts?

A: Any memory would thrive in an environment where there was an unbridled thirst for memory, as there currently is in artificial intelligence (AI). But AI is undergoing some pretty rapid changes, not only in the number of the parameters that are examined, but also in the models that are being used for it. We recently read a paper that was written by Apple where they actually found ways of winnowing down the data that was used for a large language model into something that would fit into an Apple MacBook Pro M2 and they were able to get good performance by doing that. They really accelerated things by ignoring data that didn’t really make any difference. So, if you take that kind of an approach and say: “Okay. If those guys keep working on that problem that way, and they take it to the extreme, then you might not need all that much memory after all.” But still, if memory were free, I’m sure that there’d be a ton of it out there and that is just a question of whether or not these memories can get cheaper than DRAM so that they can look like they’re free compared to what things look like today.

There are three interesting elements of this. First, CXL, in addition to allowing mixing of memory types, again allows you to put those domain specific processors close to the memory. Perhaps those can do some of the processing that’s part of the model, in which case it would lower the energy consumption. Second, it supports different computing models than what we traditionally use. Of course there is quantum computing, but there also is something called neural networks which actually use the memory as a matrix multiplier, and those are using these emerging memories for that technology which could be used for AI applications. Third, and sort of hidden behind this, spin tunnelling is changing processing itself: right now everything is current-based, but there’s work going on in spintronic devices that, instead of using current, would use the spin of electrons for moving data around, in which case we can avoid resistive heating and our processing could run a lot cooler and use less energy. So, there are a lot of interesting things buried in the different technologies being used for these emerging memories that could have even greater implications for the development of computing beyond just the memory applications themselves. And to elaborate on spintronics, we’re talking about logic and not about spin memory: using spin rather than charge, which is what current is.

Q: Flash has an endurance issue (maximum number of writes before it fails). In your opinion, what is the minimum acceptable endurance (number of writes) that an emerging memory should support?

It’s amazing how many techniques have fallen into place since wear was an issue in flash SSDs. Today’s software understands which loads have high write levels and which don’t, and different SSDs can be used to handle the two different kinds of load. On the SSD side, flash endurance has continually degraded with the adoption of MLC, TLC, and QLC, and is sometimes measured in the hundreds of cycles. What this implies is that any emerging memory can get by with an equally low endurance as long as it’s put behind the right controller.

In high-speed environments this isn’t a solution, though, since controllers add latency, so “Near Memory” (the memory tied directly to the processor’s memory bus) will need to have higher endurance. Still, one practice that can help is putting code into memories that have low endurance and data into higher-endurance memory (which today would be DRAM). Because emerging memories can provide more bits at a lower cost and power than DRAM, the write load to the code space should be lower, as pages will be swapped in and out less frequently. The endurance requirements will depend on this swapping, and I would guess that the lowest-acceptable level would be in the tens of thousands of cycles.
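To put rough numbers on endurance, here is a minimal Python sketch of the usual back-of-the-envelope wear-out estimate. It assumes ideal wear leveling behind a controller, and every figure in it (capacity, cycle counts, daily write volume, write amplification) is an illustrative assumption rather than a value from the presentation.

```python
def lifetime_years(capacity_gb, endurance_cycles, host_writes_gb_per_day,
                   write_amplification=1.0):
    """Estimate years until wear-out, assuming ideal wear leveling.

    Total writable data is capacity times rated write cycles; the media
    sees host writes multiplied by the write amplification factor.
    """
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write volume and write amplification must be positive")
    total_writable_gb = capacity_gb * endurance_cycles
    media_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_writable_gb / media_writes_gb_per_day / 365.0


if __name__ == "__main__":
    # Hypothetical device: 1 TB, written at 500 GB/day with 2x write amplification.
    for cycles in (500, 20_000):
        years = lifetime_years(1000, cycles, 500, write_amplification=2.0)
        print(f"{cycles:>6} cycles -> about {years:.1f} years")
```

The estimate scales linearly with the cycle count, which is why a memory without a wear-leveling controller in front of it needs a much higher raw endurance to survive the same write load.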

Q: It seems that persistent memory is more of an enterprise benefit rather than a consumer benefit. And consumer acceptance helps the advancement and cost scaling issues. Do you agree? I use SSDs as an example. Once consumers started using them, the advancement and prices came down greatly.

Anything that drives increased volume will help. In most cases any change to large-scale computing works its way down to the PC, so this should happen in time here, too. But today there’s a growing amount of MRAM use in personal fitness monitors, and this will help drive costs down, so initial demand will not exclusively come from enterprise computing. At the same time, the IBM FlashDrive that we mentioned uses MRAM, too, so both enterprise and consumer are already working to simultaneously grow consumption.

Q: The CXL diagram (slide 22 in the PDF) has 2 CXL switches between the CPUs and the memory. How much latency do you expect the switches to add, and how does that change where CXL fits on the array of memory choices from a performance standpoint?

The CXL delay goals are very aggressive, but I am not sure that an exact number has been specified. It’s on the order of 70ns per “Hop,” which can be understood as the delay of going through a switch or a controller. Naturally, software will evolve to work with this, and will move data that has high bandwidth requirements but is less latency-sensitive to more remote areas, while managing the more latency-sensitive data to near memory.
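As a rough illustration of that arithmetic, the Python sketch below adds up per-hop delay using the "on the order of 70ns" figure from the answer. The local DRAM latency and the choice to count one device controller as an extra hop are assumptions made for the example, not numbers from the CXL specification.

```python
HOP_LATENCY_NS = 70  # "on the order of 70ns per hop", per the answer above


def cxl_added_latency_ns(hops, hop_latency_ns=HOP_LATENCY_NS):
    """Latency added by traversing `hops` switches or controllers."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return hops * hop_latency_ns


def total_latency_ns(base_memory_ns, switches, controllers=0,
                     hop_latency_ns=HOP_LATENCY_NS):
    """Base memory access time plus one hop per switch and per controller."""
    return base_memory_ns + cxl_added_latency_ns(switches + controllers,
                                                 hop_latency_ns)


if __name__ == "__main__":
    base_ns = 100  # assumed near-memory DRAM latency, for illustration only
    print("two switches add", cxl_added_latency_ns(2), "ns")
    print("total with one device controller:",
          total_latency_ns(base_ns, switches=2, controllers=1), "ns")
```

Under these assumptions, memory behind two switches lands at a few hundred nanoseconds, slower than near memory but still far faster than storage, which is why software would be expected to place latency-sensitive data close and bandwidth-oriented data further out.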

Q: Where can I learn more about the topic of Emerging Memories?

Here are some resources to review:

Persistent Memory Summit Presentations
Blog post on the first FRAM, which was made in 1952, making it the first semiconductor memory: FRAM Turns 68
Blog post on Gordon Moore’s 1970 PCM paper: Original PCM Article from 1970
Blog post with overview of all the emerging memory technologies: Emerging Memories Today: The Technologies: MRAM, ReRAM, PCM/XPoint, FRAM, etc.

* LLM in a Flash: Efficient Large Language Model Inference with Limited Memory, Keivan Alizadeh, et al., arXiv:2312.11514 [cs.CL]

Power Efficiency Measurement – Our Experts Make It Clear – Part 2

February 3, 2024 Eden Kim, Keith Orsak and Chuck Paradon

In this SNIA Experts on Data blog series, our experts in the SNIA Solid State Storage Technical Work Group and the SNIA Green Storage Initiative explore factors to consider in power efficiency measurement, including the nature of application workloads, IO streams, and access patterns; the choice of storage products (SSDs, HDDs, cloud storage, and more); the impact of hardware and software components (host bus adapters, drivers, OS layers); and access to read and write caches, CPU and GPU usage, and DRAM utilization.

Join us on our journey to better power efficiency as we continue with Part 2: Impact of Workloads on Power Efficiency Measurement. And if you missed Part 1: Key Issues in Power Efficiency Measurement, you can find it here. Bookmark this blog and check back in March and April for the continuation of our four-part series. And explore the topic further in the SNIA Green Storage Knowledge Center.

Part 2: Impact of Workloads on Power Efficiency Measurement

Workloads are a significant driving force behind power consumption in computing systems. Different tasks and applications place diverse demands on hardware, leading to fluctuations in the amount of power used. Here’s a breakdown of how workloads can influence power consumption:

CPU Utilization. The CPU’s power consumption increases as it processes tasks, with more demanding workloads that involve complex calculations or multitasking leading to higher CPU utilization and, consequently, elevated power usage.
Memory Access is another key factor. Accessing memory modules consumes power, and workloads that heavily rely on frequent memory read and write operations can significantly contribute to increased power consumption.
Disk Activity, particularly read and write operations on storage devices (whether HDDs or SSDs), consumes power. Workloads that involve frequent data access or large file transfers can lead to an uptick in power consumption.
GPU Usage plays a crucial role, especially in tasks like gaming, video editing, and machine learning. High GPU utilization for rendering complex graphics or training deep neural networks can result in substantial power consumption.
Network Communication tasks, such as data transfers, streaming, or online gaming, require power from both the CPU and the network interface. The extent of communication and data throughput can significantly affect overall power usage.
In devices equipped with displays, Screen Brightness directly impacts power consumption. Brighter screens consume more power, which means workloads involving continuous display usage contribute to higher power consumption.
I/O Operations encompass interactions with peripherals like storage devices or printers. These operations can lead to short bursts of power consumption, especially if multiple devices are connected.
Understanding the contrast between Idle and Active States is essential. Different workloads can transition devices between these states, with idle periods generally exhibiting lower power consumption. However, certain workloads may keep components active even during seemingly idle times.
Dynamic Voltage and Frequency Scaling is prevalent in many systems, allowing them to adjust the voltage and frequency of components based on workload demands. Increased demand leads to higher clock speeds and voltage, ultimately resulting in higher power consumption.
Background Processes also come into play. Background applications, updates, and system maintenance tasks can impact power consumption, even when the user isn’t actively engaging with the device.
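The Dynamic Voltage and Frequency Scaling item above can be illustrated with the standard first-order model for CMOS dynamic power, P = α · C · V² · f. The Python sketch below is only an approximation under that model: it ignores static (leakage) power, and the activity factor, capacitance, voltage, and frequency values are made up for illustration.

```python
def dynamic_power_watts(activity_factor, capacitance_f, voltage_v, frequency_hz):
    """First-order CMOS dynamic power: P = alpha * C * V^2 * f."""
    return activity_factor * capacitance_f * voltage_v ** 2 * frequency_hz


def relative_dynamic_power(voltage_scale, frequency_scale):
    """Dynamic power relative to baseline when V and f are each scaled."""
    return voltage_scale ** 2 * frequency_scale


if __name__ == "__main__":
    # Hypothetical component: alpha = 0.2, C = 1 nF, V = 1.0 V, f = 3 GHz.
    baseline = dynamic_power_watts(0.2, 1e-9, 1.0, 3e9)
    print(f"baseline dynamic power: {baseline:.2f} W")

    # Scaling both voltage and frequency to 80% of baseline.
    ratio = relative_dynamic_power(0.8, 0.8)
    print(f"at 80% V and 80% f: {ratio:.1%} of baseline")
```

Because voltage enters as a square, lowering voltage and frequency together cuts dynamic power faster than it cuts clock speed, which is why a light workload can draw disproportionately less power than a heavy one.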

In practical terms, comprehending how various workloads affect power consumption is vital for optimizing energy efficiency. For instance, laptops can extend their battery life by reducing screen brightness, closing unnecessary applications, and selecting power-saving modes.

Moreover, SSDs are designed with optimizations for background processes in mind. Garbage collection and NAND Flash cell management often occur during idle periods or periods of low-impact workloads.

Likewise, data centers and cloud providers strategically manage workloads to minimize energy consumption and operational costs while upholding performance standards.

Power Efficiency Measurement – Our Experts Make It Clear – Part 1

January 2, 2024 Eden Kim, Keith Orsak and Chuck Paradon

In this SNIA Experts on Data blog series, our experts in the SNIA Solid State Storage Technical Work Group and the SNIA Green Storage Initiative explore factors to consider in power efficiency measurement, including the nature of application workloads, IO streams, and access patterns; the choice of storage products (SSDs, HDDs, cloud storage, and more); the impact of hardware and software components (host bus adapters, drivers, OS layers); and access to read and write caches, CPU and GPU usage, and DRAM utilization.

Join us on our journey to better power efficiency as we begin with Part 1: Key Issues in Power Efficiency Measurement. Bookmark this blog and check back in February, March, and April for the continuation of our four-part series. And explore the topic further in the SNIA Green Storage Knowledge Center.

Part 1: Key Issues in Power Efficiency Measurement

Ensuring accurate and precise power consumption measurements is challenging, especially at the individual device level, where even minor variations can have a significant impact. Achieving reliable data necessitates addressing factors like calibration, sensor quality, and noise reduction.

Furthermore, varying workloads in systems require careful consideration to accurately capture transient power spikes and average power consumption. Modern systems are composed of interconnected components that affect each other’s power consumption, making it difficult to isolate individual component power usage.
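As a minimal sketch of the difference between transient and average behavior, the Python function below takes a sampled power trace and reports energy (by trapezoidal integration), average power, and the highest sample. The trace values are invented for illustration, and the function name is hypothetical rather than part of any SNIA tool.

```python
def summarize_power_trace(timestamps_s, power_w):
    """Return energy (J), average power (W), and peak sample (W) for a trace."""
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need two or more samples of equal length")
    energy_j = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Trapezoidal rule: mean of adjacent samples times the interval.
        energy_j += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    duration_s = timestamps_s[-1] - timestamps_s[0]
    return {
        "energy_j": energy_j,
        "average_w": energy_j / duration_s,
        "peak_w": max(power_w),
    }


if __name__ == "__main__":
    # Illustrative trace: steady 5 W with one brief spike to 15 W.
    summary = summarize_power_trace([0.0, 1.0, 2.0, 3.0], [5.0, 5.0, 15.0, 5.0])
    print(summary)
```

The peak reported here is only the largest sample; a spike shorter than the sampling interval would be missed entirely, which is one reason fast transients may call for specialized measurement equipment.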

The act of measuring power itself consumes energy, creating a trade-off between measurement accuracy and the disturbance caused by measurement equipment. To address this, it’s important to minimize measurement overheads while still obtaining meaningful data.

Environmental factors such as temperature, humidity, and airflow can unpredictably influence power consumption, emphasizing the need for standardized test environments. Rapid workload changes can lead to transient power behavior that may require specialized equipment for accurate measurement.

Software running on a system significantly influences power consumption, emphasizing the importance of selecting representative workloads and ensuring consistent software setups across measurements. Dynamic voltage and frequency scaling is used in many systems to optimize power consumption, and understanding its effects under different conditions is crucial.

Correctly interpreting raw power consumption data is essential to draw meaningful conclusions about efficiency. This requires statistical analysis and context-specific considerations. Real-world variability, stemming from manufacturing differences, component aging, and user behavior, must also be taken into account in realistic assessments.
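One simple form of that statistical analysis is to repeat a measurement several times and report the spread alongside the mean. The Python sketch below uses only the standard library; the five readings are invented for illustration, and the choice of summary statistics is an assumption rather than a SNIA-prescribed method.

```python
import math
import statistics


def summarize_runs(power_readings_w):
    """Summarize repeated power measurements of the same workload."""
    if len(power_readings_w) < 2:
        raise ValueError("need two or more runs to estimate variability")
    mean_w = statistics.mean(power_readings_w)
    stdev_w = statistics.stdev(power_readings_w)  # sample standard deviation
    return {
        "mean_w": mean_w,
        "stdev_w": stdev_w,
        # Standard error of the mean shrinks as more runs are added.
        "sem_w": stdev_w / math.sqrt(len(power_readings_w)),
        # Coefficient of variation: spread relative to the mean.
        "cv": stdev_w / mean_w,
    }


if __name__ == "__main__":
    runs = [10.0, 10.2, 9.8, 10.1, 9.9]  # hypothetical average watts per run
    stats = summarize_runs(runs)
    print(f"mean {stats['mean_w']:.2f} W, stdev {stats['stdev_w']:.3f} W, "
          f"CV {stats['cv']:.2%}")
```

A low coefficient of variation across runs suggests the test setup is repeatable; a high one points back to the environmental and workload factors described above.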

Addressing these challenges necessitates a combination of precise measurement equipment, thoughtful experimental design, and a deep understanding of the system and device being investigated.

In our next blog, Part 2, we will examine the impact of workloads on power efficiency measurement.

Open Standards Featured at FMS 2023

July 31, 2023 SNIA CMSI

SNIA welcomes colleagues to join us at the upcoming Flash Memory Summit, August 8-10, 2023 in Santa Clara, CA.

SNIA is pleased to join standards organizations CXL Consortium™ (CXL™), PCI-SIG®, and Universal Chiplet Interconnect Express™ (UCIe™) in an Open Standards Pavilion, Booth #725, in the Exhibit Hall. CMSI will feature SNIA member companies in a computational storage cross-industry demo by Intel, MINIO, and Solidigm and a data filtering demo by ScaleFlux; a software memory tiering demo by VMware; a persistent memory workshop and hackathon; and the latest on SSD form factors E1 and E3 from the SNIA SFF TA Technical Work Group. SMI will showcase SNIA Swordfish® management of NVMe SSDs on Linux with demos by Intel, Samsung, and Solidigm.

CXL will discuss their advances in coherent connectivity. PCI-SIG will feature their PCIe 5.0 (32GT/s) and PCIe 6.0 (64GT/s) architectures, industry adoption, and the upcoming PCIe 7.0 specification development (128GT/s). UCIe will discuss their new open industry standard establishing a universal interconnect at the package level.

SNIA STA Forum will also be in Booth #849 – learn more about the SCSI Trade Association joining SNIA.

These demonstrations and discussions will augment FMS program sessions in the SNIA-sponsored System Architecture Track on memory, computational storage, CXL, and UCIe standards. A SNIA mainstage session on Wednesday August 9 at 2:10 pm will discuss Trends in Storage and Data: New Directions for Industry Standards.

SNIA colleagues and friends can receive a $100 discount off the 1-, 2-, or 3-day full conference registration by using code SNIA23.

Visit snia.org/fms to learn more about the exciting activities at FMS 2023 and join us there!

So just what is an SSD?

July 20, 2023 Jonmichael Hands

“What is an SSD?” seems like an easy enough question, but surprisingly, most of the search results for it quickly get confused about media, controllers, form factors, storage interfaces, performance, reliability, and different market segments.

The SNIA SSD SIG has spent time demystifying various SSD topics like endurance, form factors, and the different classifications of SSDs – from consumer to enterprise and hyperscale SSDs.

“Solid state drive is a general term that covers many market segments, and the SNIA SSD SIG has developed a new overview of ‘What is an SSD?’,” said Jonmichael Hands, SNIA SSD Special Interest Group (SIG) Co-Chair. “We are committed to helping make storage technology topics, like endurance and form factors, much easier to understand coming straight from the industry experts defining the specifications.”

The “What is an SSD?” page offers a concise description of what SSDs do, how they perform, and how they connect, and provides a jumping-off point for more in-depth clarification of the many aspects of SSDs. It joins an ever-growing category of 20 one-page “What Is?” answers that provide a clear and concise, vendor-neutral definition of often-asked technology terms, a description of what they are, and how each of these technologies works. Check out all the “What Is?” entries at https://www.snia.org/education/what-is

And don’t miss other topics of interest from the SNIA SSD SIG, including Total Cost of Ownership Model for Storage and SSD videos and presentations in the SNIA Educational Library.

Your comments and feedback on this page are welcomed. Send them to askcmsi@snia.org.